Abstract

Entanglement purification protocols (EPP) and quantum error- 
correcting codes (QECC) provide two ways of protecting quantum 
states from interaction with the environment. In an EPP, perfectly 
entangled pure states are extracted, with some yield D, from a mixed 
state M shared by two parties; with a QECC, an arbitrary quantum 
state $|\xi\rangle$ can be transmitted at some rate Q through a noisy channel
$\chi$ without degradation. We prove that an EPP involving one-way
classical communication and acting on mixed state $M(\chi)$ (obtained
by sharing halves of EPR pairs through a channel $\chi$) yields a QECC
on $\chi$ with rate Q = D, and vice versa. We compare the amount
of entanglement E(M) required to prepare a mixed state M by local
actions with the amounts $D_1(M)$ and $D_2(M)$ that can be locally
distilled from it by EPPs using one- and two-way classical communi- 
cation respectively, and give an exact expression for E(M) when M is 
Bell-diagonal. While EPPs require classical communication, QECCs 
do not, and we prove Q is not increased by adding one-way classical 
communication. However, both D and Q can be increased by adding 
two-way communication. We show that certain noisy quantum chan- 
nels, for example a 50% depolarizing channel, can be used for reliable 
transmission of quantum states if two-way communication is available, 
but cannot be used if only one-way communication is available. We 
exhibit a family of codes based on universal hashing able to achieve 
an asymptotic Q (or D) of $1 - S$ for simple noise models, where S is
the error entropy. We also obtain a specific, simple 5-bit
single-error-correcting quantum block code. We prove that iff a QECC results in
high fidelity for the case of no error the QECC can be recast into a 
form where the encoder is the matrix inverse of the decoder. 

PACS numbers: 03.65.Bz, 42.50.Dv, 89.70.+c

1 Introduction



1.1 Entanglement and nonlocality in quantum physics 

Among the most celebrated features of quantum mechanics is the
Einstein-Podolsky-Rosen [1] (EPR) effect, in which anomalously strong correlations
are observed between presently noninteracting particles that have interacted 
in the past. These nonlocal correlations occur only when the quantum state 
of the entire system is entangled, i.e., not representable as a tensor product of 
states of the parts. In Bohm's version of the EPR paradox, a pair of spin-1/2 
particles, prepared in the singlet state 

$$\Psi^- = \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right), \qquad (1)$$

and then separated, exhibit perfectly anticorrelated spin components when 
locally measured along any axis. Bell [2] and Clauser et al. showed that
these statistics violate inequalities that must be satisfied by any classical 
local hidden variable model of the particles' behavior. Repeated experimental 
confirmation [3] of the nonlocal correlations predicted by quantum mechanics
is regarded as strong evidence in its favor. 

Besides helping to confirm the validity of quantum mechanics, entangle- 
ment has assumed an important role in quantum information theory, a role in 
many ways complementary to the role of classical information. Much recent 
work in quantum information theory has aimed at characterizing the chan- 
nel resources necessary and sufficient to transmit unknown quantum states, 
rather than classical data, from a sender to a receiver. To avoid violations 
of physical law, the intact transmission of a general quantum state requires 
both a quantum resource, which cannot be cloned, and a directed resource, 
which cannot propagate superluminally. The sharing of entanglement re- 
quires only the former, while purely classical communication requires only 
the latter. In quantum teleportation [5] the two requirements are met by two
separate systems, while in the direct, unimpeded transmission of a quantum 
particle, they are met by the same system. Quantum data compression [6]
optimizes the use of quantum channels, allowing redundant quantum data, 
such as a random sequence of two non-orthogonal states, to be compressed 
to a bulk approximating its von Neumann entropy, then recovered at the 
receiving end with negligible distortion. On the other hand, quantum
superdense coding uses previously shared entanglement to double a quantum
channel's capacity for carrying classical information.

Probably the most important achievement of classical information theory 
is the ability, using error-correcting codes, to transmit data reliably through a 
noisy channel. Quantum error-correcting codes (QECC) [8], [9], [10], [11], [12], [13],
[14], [15], [16] use coherent generalizations of classical error-correction techniques
to protect quantum states from noise and decoherence during transmission 
through a noisy channel or storage in a noisy environment. Entanglement 
purification protocols (EPP) [17] achieve a similar result indirectly, by
distilling pure entangled states (e.g. singlets) from a larger number of impure
entangled states (e.g. singlets shared through a noisy channel). The purified 
entangled states can then be used for reliable teleportation, thereby achiev- 
ing the same effect as if a noiseless storage or transmission channel had been 
available. The present paper develops the quantitative theory of mixed state 
entanglement and its relation to reliable transmission of quantum information.

Figure 1: Typical scenario for creation of entangled quantum states. At some
early time and at location J, two quantum systems A and B interact [18],
then become spatially separated, one going to Alice and the other to Bob.
The joint system's state lies in a Hilbert space $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ that is the
tensor product of the spaces of the subsystems, but the state itself is not
expressible as a product of states of the subsystems: $\Upsilon \neq \Upsilon_A \otimes \Upsilon_B$. State
$\Upsilon$, its pieces acted upon separately by noise processes $N_A$ and $N_B$, evolves
into mixed state M.



Entanglement is a property of bipartite systems: systems consisting of
two parts A and B that are too far apart to interact, and whose state,
pure or mixed, lies in a Hilbert space $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ that is the tensor
product of Hilbert spaces of these parts. Our goal is to develop a general 
theory of state transformations that can be performed on a bipartite system 
without bringing the parts together. We consider these transformations to 
be performed by two observers, "Alice" and "Bob," each having access to 
one of the subsystems. We allow Alice and Bob to perform local actions, e.g. 
unitary transformations and measurements, on their respective subsystems 
along with whatever ancillary systems they might create in their own labs. 
Sometimes we will also allow them to coordinate their actions through one- 
way or two-way classical communication; however, we do not allow them to 
perform nonlocal quantum operations on the entire system nor to transmit 
fresh quantum states from one observer to the other. Of course two-way or 
even one-way classical communication is itself an element of nonlocality that 
would not be permitted, say, in a local hidden variable model, but we find that 
giving Alice and Bob the extra power of classical communication considerably 
enhances their power to manipulate bipartite states, without giving them so 
much power as to make all state transformations trivially possible, as would 
be the case if nonlocal quantum operations were allowed. We will usually 
assume that $\mathcal{H}_A$ and $\mathcal{H}_B$ have equal dimension N (no generality is lost, since
either subsystem's Hilbert space can be embedded in a larger one by local 
actions). 



1.2 Pure-state entanglement 

For pure states, a sharp distinction can be drawn between entangled and 
unentangled states: a pure state is entangled or nonlocal if and only if its 
state vector $\Upsilon$ cannot be expressed as a product $\Upsilon_A \otimes \Upsilon_B$ of pure states of its
parts. It has been shown that every entangled pure state violates some
Bell-type inequality [19], while no product state does. Entangled states cannot be
prepared from unentangled states by any sequence of local actions of Alice
and Bob, even with the help of classical communication.

Quantitatively, a pure state's entanglement is conveniently measured by 
its entropy of entanglement, 

$$E(\Upsilon) = S(\rho_A) = S(\rho_B), \qquad (2)$$

the apparent entropy of either subsystem considered alone. Here
$S(\rho) = -\mathrm{Tr}\,\rho \log_2 \rho$ is the von Neumann entropy and $\rho_A = \mathrm{Tr}_B |\Upsilon\rangle\langle\Upsilon|$ is the reduced
density matrix obtained by tracing the whole system's pure-state density
matrix $|\Upsilon\rangle\langle\Upsilon|$ over Bob's degrees of freedom. Similarly $\rho_B = \mathrm{Tr}_A |\Upsilon\rangle\langle\Upsilon|$ is
the partial trace over Alice's degrees of freedom.

The quantity E, which we shall henceforth often call simply entanglement, 
ranges from zero for a product state to $\log_2 N$ for a maximally-entangled state
of two N-state particles. E = 1 for the singlet state of Eq. (1), either of
whose spins, considered alone, appears to be in a maximally-mixed state with
1 bit of entropy. Paralleling the term qubit for any two-state quantum system
(e.g. a spin-1/2 particle), we define an ebit as the amount of entanglement in
a maximally entangled state of two qubits, or any other pure bipartite state
for which E = 1.
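
The entropy of entanglement of Eq. (2) can be checked numerically for two qubits. The following is a minimal sketch, not part of the original derivation: the function names and the ordering of the amplitudes are illustrative assumptions, and the closed form for the eigenvalues of Alice's reduced density matrix is specific to a pair of two-state systems.

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with 0 log 0 taken as 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def entanglement_entropy(psi):
    """Entropy of entanglement E of a normalized two-qubit pure state.

    psi = [a_uu, a_ud, a_du, a_dd] holds the amplitudes in the product basis
    |uu>, |ud>, |du>, |dd> (u = spin up, d = spin down; assumed ordering).
    Alice's reduced density matrix is rho_A = C C^dagger, where C is the 2x2
    matrix of amplitudes; its eigenvalues are (1 +/- sqrt(1 - 4|det C|^2)) / 2.
    """
    a_uu, a_ud, a_du, a_dd = psi
    det_c = a_uu * a_dd - a_ud * a_du
    disc = max(0.0, 1.0 - 4.0 * abs(det_c) ** 2)  # clamp rounding noise
    lam = 0.5 * (1.0 + math.sqrt(disc))
    return binary_entropy(lam)

s = 1 / math.sqrt(2)
singlet = [0.0, s, -s, 0.0]        # the state of Eq. (1)
product = [1.0, 0.0, 0.0, 0.0]     # |uu>, an unentangled state

print(entanglement_entropy(singlet))  # prints: 1.0
print(entanglement_entropy(product))  # prints: 0.0
```

The singlet gives one ebit and the product state gives zero, matching the two ends of the range quoted above for N = 2.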

Properties of E that make it a natural entanglement measure for pure 
states include: 

• The entanglement of independent systems is additive, n shared singlets 
for example having n ebits of entanglement. 

• E is conserved under local unitary operations, i.e., under any unitary
transformation U that can be expressed as a product $U = U_A \otimes U_B$ of
unitary operators on the separate subsystems.

• The expectation of E cannot be increased by local nonunitary
operations: if a bipartite pure state $\Upsilon$ is subjected to a local nonunitary
operation (e.g. measurement by Alice) resulting in residual pure states
$\Upsilon_j$ with respective probabilities $p_j$, then the expected entanglement of
the final states $\sum_j p_j E(\Upsilon_j)$ is no greater, but may be less, than the
original entanglement $E(\Upsilon)$ [20]. In the present paper we generalize
this result to mixed states: see Sec. 2.1.



• Entanglement can be concentrated and diluted with unit asymptotic
efficiency [20], in the sense that for any two bipartite pure states $\Upsilon$
and $\Upsilon'$, if Alice and Bob are given a supply of n identical systems in
a state $\mathbf{\Upsilon} = (\Upsilon)^n$, they can use local actions and one-way classical
communication to prepare m identical systems in state $\mathbf{\Upsilon}' \approx (\Upsilon')^m$,
with the yield m/n approaching $E(\Upsilon)/E(\Upsilon')$, the fidelity $|\langle\mathbf{\Upsilon}'|(\Upsilon')^m\rangle|^2$
approaching 1, and probability of failure approaching zero in the limit
of large n.

With regard to entanglement, a pure bipartite state $\Upsilon$ is thus completely
parameterized by $E(\Upsilon)$, with $E(\Upsilon)$ being both the asymptotic number of
standard singlets required to locally prepare a system in state $\Upsilon$ (its
"entanglement of formation") and the asymptotic number of standard singlets
that can be prepared from a system in state $\Upsilon$ by local operations (its
"distillable entanglement").



1.3 Mixed-state entanglement 

One aim of the present paper is to extend the quantitative theory of entangle- 
ment to the more general situation in which Alice and Bob share a mixed state 
M, rather than a pure state $\Upsilon$ as discussed above. Entangled mixed states
may arise (cf. Fig. 1) when one or both parts of an initially pure entangled
state interact, intentionally or inadvertently, with other quantum degrees of
freedom (shown in the diagram as noise processes $N_A$ and $N_B$ and shown
explicitly in quantum channel $\chi$ in Fig. 13) resulting in a non-unitary evolution
of the pure state $\Upsilon$ into a mixed state M. Another principal aim is to elucidate
the extent to which mixed entangled states, or the noisy channels used
to produce them, can nevertheless be used to transmit quantum information 
reliably. In this connection we develop a family of one-way entanglement
purification protocols [17] and corresponding quantum error-correcting codes,
as well as two-way entanglement purification protocols which can be used
to transmit quantum states reliably through channels too noisy to be used 
reliably with any quantum error-correcting code. 

The theory of mixed-state entanglement is more complicated and less 
well understood than that of pure-state entanglement. Even the qualita- 
tive distinction between local and nonlocal states is less clear. For exam- 
ple, Werner has described mixed states which violate no Bell inequality 
with regard to simple spin measurements, yet appear to be nonlocal in other 
subtler ways. These include improving the fidelity of quantum teleportation
above what could be achieved by purely classical communication [22]
and giving nonclassical statistics when subjected to a sequence of
measurements [23].



Quantitatively, no single parameter completely characterizes mixed state 
entanglement the way E does for pure states. For a generic mixed state, we 
do not know how to distill out of the mixed state as much pure entanglement 
(e.g. standard singlets) as was required to prepare the state in the first place;
moreover, for some mixed states, entanglement can be distilled with the help
of two-way communication between Alice and Bob, but not with one-way
communication. In order to deal with these complications, we introduce three
entanglement measures $D_1(M) \le D_2(M) \le E(M)$, each of which reduces to
E for pure states, but at least two of which ($D_1$ and $D_2$) are known to be
inequivalent for a generic mixed state.

Our fundamental measure of entanglement, for which we continue to use 
the symbol E, will be a mixed state's entanglement of formation E(M), 
defined as the least expected entanglement of any ensemble of pure states 
realizing M. We show that local actions and classical communication can- 
not increase the expectation of E(M) and we give exact expressions for the 
entanglement of formation of a simple class of mixed states: states of two 
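spin-1/2 particles that are diagonal in the so-called Bell basis. This basis
consists of four maximally-entangled states: the singlet state of Eq. (1), and
the three triplet states

$$\Psi^+ = \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right) \qquad (3)$$
$$\Phi^\pm = \frac{1}{\sqrt{2}}\left(|\uparrow\uparrow\rangle \pm |\downarrow\downarrow\rangle\right). \qquad (4)$$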

We also give lower bounds on the entanglement of formation of other, more 
general mixed states. Nonzero E(M) will again serve as our qualitative 
criterion of nonlocality; thus, a mixed state will be considered local if it can be
expressed as a mixture of product states, and nonlocal if it cannot. 

By distillable entanglement we will mean the asymptotic yield of arbi- 
trarily pure singlets that can be prepared locally from mixed state M by 
entanglement purification protocols (EPP) involving one-way or two-way 
communication between Alice and Bob. Distillable entanglement for one- 
and two-way communication will be denoted $D_1(M)$ and $D_2(M)$, respectively.
Except in cases where we have been able to prove that $D_1$ or $D_2$ is
identically zero, we have no explicit values for distillable entanglement, but 
we will exhibit various upper bounds, as well as lower bounds given by the 
yield of particular purification protocols.

1.4 Entanglement purification and quantum error correction

Entanglement purification protocols (EPP) will be the subject of a large 
portion of this paper; we describe them briefly here. The most powerful 
protocols, depicted in Fig. 2, involve two-way communication. Alice and
Bob begin by sharing a bipartite mixed state $\mathbf{M} = (M)^n$ consisting of n
entangled pairs of particles each described by the density matrix M, then
proceed by repeated application of three steps: 1) Alice and Bob perform
unitary transformations on their states; 2) They perform measurements on
some of the particles; and 3) They share the results of these measurements,
using this information to choose which unitary transformations to perform
in the next stage. The object is to sacrifice some of the particles, while
maneuvering the others into a close approximation of a maximally entangled
state such as $\mathbf{\Upsilon} = (\Psi^-)^m$, the tensor product of m singlets, where 0 < m < n.
No generality is lost by using only unitary transformations and von Neumann
measurements in steps 1) and 2), because Alice and Bob are free at the outset
to enlarge the Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ to include whatever ancillas they
might need to perform nonunitary operations and generalized measurements 
on the original systems. 

A restricted version of the purification protocol involving only one-way
communication is illustrated in Fig. 3. Here, without loss of generality, we
permit only one stage of unitary operation and measurement, followed by a
one-way classical communication. The principal advantage of such a protocol
is that the components of the resulting purified maximally entangled state
indicated by (*) can be separated both in space and in time. In Secs. 5 and 6
we show that the time-separated EPR pairs resulting from such a one-way 
protocol (1-EPP) always permit the creation of a quantum error-correction 
code (QECC) whose rate and fidelity are respectively no less than the yield 
m/n and fidelity of the purified states produced by the 1-EPP. 

Figure 2: Entanglement purification protocol involving two-way classical
communication (2-EPP). In the basic step of 2-EPP, Alice and Bob subject
the bipartite mixed state to two local unitary transformations $U_1$ and
$U_2$. They then measure some of their particles $\mathcal{M}$, and interchange the
results of these measurements (classical data transmission indicated by double
lines). After a number of stages, such a protocol can produce a pure,
near-maximally-entangled state (indicated by *'s).

Figure 3: One-way Entanglement Purification Protocol (1-EPP). In 1-EPP
there is only one stage; after unitary transformation $U_1$ and measurement
$\mathcal{M}$, Alice sends her classical result to Bob, who uses it in combination with
his measurement result to control a final transformation $U_3$. The
unidirectionality of communication allows the final, maximally-entangled state (*) to
be separated both in space and in time.

Figure 4: If the 1-EPP of Fig. 3 is used as a module for creating
time-separated EPR pairs (*), then by using quantum teleportation [5], an
arbitrary quantum state $|\xi\rangle$ may be recovered exactly after $U_4$, despite the
presence of intervening noise. This is the desired effect of a quantum error
correcting code (QECC).

The link between 1-EPP and QECC is provided by quantum teleportation [5].
As Fig. 4 illustrates, the availability of the time-separated EPR state (*)
means that an arbitrary quantum state $|\xi\rangle$ (in a Hilbert space no larger than
$2^m$) can be teleported forward in time: the teleportation is initiated with
Alice's Bell measurement and is completed by Bob's unitary
transformation $U_4$. The net effect is that an exact replica of $|\xi\rangle$ reappears at the
end, despite the presence of noise ($N_{A,B}$) in the intervening quantum
environment. Moreover, we will show in detail in Sec. 6 that the protocol of
Fig. 4 can be converted into a much simpler protocol with the same
quantum communication capacity but involving neither entanglement nor classical
communication, and having the topology of a quantum error correcting code
(Fig. |TD [|, |, 0, 0, 0, 0, |I§ . 



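Many features of mixed-state entanglement, along with their consequences
for noisy-channel coding, are illustrated by a particular mixed state, the
Werner state [21]

$$W_{5/8} = \frac{5}{8}|\Psi^-\rangle\langle\Psi^-| + \frac{1}{8}\left(|\Psi^+\rangle\langle\Psi^+| + |\Phi^+\rangle\langle\Phi^+| + |\Phi^-\rangle\langle\Phi^-|\right). \qquad (5)$$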

This state, a 5/8 vs. 3/8 singlet-triplet mixture, can be produced by mixing 
equal amounts of singlets and random uncorrelated spins, or equivalently 
by sending one spin of an initially pure singlet through a 50% depolarizing 
channel. (An x-depolarizing channel is one in which a state is transmitted
unaltered with probability $1 - x$ and is replaced with a completely random
qubit with probability x.) These recipes suggest that $E(W_{5/8})$, the amount
of pure entanglement required to prepare a Werner state, might be 0.5, but
we show (Sec. 2) that in fact $E(W_{5/8}) \approx 0.117$. The Werner state
is also remarkable in that pure entanglement can be distilled from it
by two-way protocols but not by any one-way protocol. In terms of
noisy-channel coding, this means that a 50% depolarizing channel, which has a
positive capacity for transmitting classical information, has zero capacity for 
transmitting intact quantum states if used in a one-way fashion, even with 
the help of quantum error-correcting codes. This will be proved in Sec. 4.
If the same channel is used in a two-way fashion, or with the help of
two-way classical communication, it has a positive capacity due to the non-zero
distillable entanglement $D_2(W_{5/8})$, which is known to lie between 0.00457
and 0.117 pure singlets out per impure pair in. The lower bound is from an 
explicit 2-EPP, while the upper bound comes from the known entanglement 
of formation, which is always an upper bound on distillable entanglement. 
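
The statement that a 50% depolarizing channel turns a singlet into the 5/8 vs. 3/8 mixture of Eq. (5) is a short piece of arithmetic. The sketch below works it out with exact fractions; the function name is an illustrative assumption, and the only physics used is that replacing one spin of a singlet by a random qubit leaves the pair totally mixed, with weight 1/4 on each Bell state.

```python
from fractions import Fraction

def depolarized_singlet_weights(x):
    """Bell-basis weights after one spin of a singlet passes through an
    x-depolarizing channel.

    With probability 1 - x the singlet survives. With probability x the spin
    is replaced by a random qubit, leaving the two spins uncorrelated and
    maximally mixed, i.e. weight 1/4 on each of the four Bell states.
    Returns (singlet weight, weight of each triplet state).
    """
    x = Fraction(x)
    singlet = (1 - x) + x / 4
    triplet_each = x / 4
    return singlet, triplet_each

singlet, triplet_each = depolarized_singlet_weights(Fraction(1, 2))
print(singlet, triplet_each)  # prints: 5/8 1/8
```

With x = 1/2 the singlet weight is 5/8 and each triplet state carries 1/8, which are the coefficients of Eq. (5).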

The remainder of this paper is organized as follows. Section 2 contains
our results on the entanglement of formation of mixed states. Section 3
explains purification of pure, maximally entangled states from mixed states.
Section 4 exhibits a class of mixed states for which $D_1 = 0$ but $D_2 > 0$.
Section 5 shows the relationship between mixed states and quantum channels.
Section 6 shows how a class of quantum error correction codes may be derived
from one-way purification protocols and contains our efficient 5 qubit code.
Finally, Sec. 7 reviews several important remaining open questions.



2 Entanglement of Formation 
2.1 Justification of the Definition 

As noted above, we define the entanglement of formation E(M) of a mixed 
state M as the least expected entanglement of any ensemble of pure states 
realizing M. The point of this subsection is to show that the designation 
"entanglement of formation" is justified: in order for Alice and Bob to create 
the state M without transferring quantum states between them, they must 
already share the equivalent of E(M) pure singlets; moreover, if they do share 
this much entanglement already, then they will be able to create M. (Both 
of these statements are to be taken in the asymptotic sense explained in the 
Introduction.) In this sense E(M) is the amount of entanglement needed to 
create M. 

Consider any specific ensemble of pure states that realizes the mixed 
state M. By means of the asymptotically entanglement-conserving mapping 
between arbitrary pure states and singlets [20], such an ensemble provides an
asymptotic recipe for locally preparing M from a number of singlets equal
to the mean entanglement of the pure states in the ensemble. Clearly some 
ensembles are more economical than others. For example, the totally mixed 
state of two qubits can be prepared at zero cost, as an equal mixture of four 
product states, or at unit cost, as an equal mixture of the four Bell states. 
The quantity E(M) is the minimum cost in this sense. However, this fact 
does not yet justify calling E(M) the entanglement of formation, because one 
can imagine more complicated recipes for preparing M: Alice and Bob could 
conceivably start with an initial mixture whose expected entanglement is 
less than E(M) and somehow, by local actions and classical communication, 
transform it into another mixture with greater expected entanglement. We 
thus need to show that such entanglement-enhancing transformations are not 
possible. 
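
The example of the totally mixed state can be verified directly. The following sketch, with illustrative helper names and an assumed amplitude ordering |uu>, |ud>, |du>, |dd>, builds both ensembles, checks that each realizes the same density matrix I/4, and compares their mean entanglement.

```python
import math

def entanglement(psi):
    """Entropy of entanglement (in ebits) of a two-qubit pure state given as
    real amplitudes [a_uu, a_ud, a_du, a_dd]."""
    a, b, c, d = psi
    det_c = a * d - b * c
    lam = 0.5 * (1.0 + math.sqrt(max(0.0, 1.0 - 4.0 * abs(det_c) ** 2)))
    if lam >= 1.0:
        return 0.0
    return -lam * math.log2(lam) - (1 - lam) * math.log2(1 - lam)

def mixture(ensemble):
    """Density matrix sum_i p_i |psi_i><psi_i| for real amplitude vectors."""
    rho = [[0.0] * 4 for _ in range(4)]
    for p, psi in ensemble:
        for r in range(4):
            for c in range(4):
                rho[r][c] += p * psi[r] * psi[c]
    return rho

def is_totally_mixed(rho, tol=1e-12):
    """True if rho equals the 4x4 identity divided by 4, within tol."""
    return all(abs(rho[r][c] - (0.25 if r == c else 0.0)) < tol
               for r in range(4) for c in range(4))

def mean_entanglement(ensemble):
    """Ensemble average of the pure-state entanglements."""
    return sum(p * entanglement(psi) for p, psi in ensemble)

s = 1 / math.sqrt(2)
product_states = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
bell_states = [[0, s, -s, 0], [0, s, s, 0], [s, 0, 0, s], [s, 0, 0, -s]]

product_ensemble = [(0.25, psi) for psi in product_states]
bell_ensemble = [(0.25, psi) for psi in bell_states]

print(is_totally_mixed(mixture(product_ensemble)))  # prints: True
print(is_totally_mixed(mixture(bell_ensemble)))     # prints: True
print(mean_entanglement(product_ensemble))          # prints: 0.0
```

Both ensembles give I/4, but the product-state recipe costs zero ebits while the Bell-state recipe costs one ebit per pair, so only the former attains the minimum.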

We start by summarizing the definitions that lead to E(M): 

Definition: The entanglement of formation of a bipartite pure state $\Upsilon$
is the von Neumann entropy $E(\Upsilon) = S(\mathrm{Tr}_A |\Upsilon\rangle\langle\Upsilon|)$ of the reduced density
matrix as seen by Alice or Bob (see Eq. 2).

Definition: The entanglement of formation $E(\mathcal{E})$ of an ensemble of
bipartite pure states $\mathcal{E} = \{p_i, \Upsilon_i\}$ is the ensemble average $\sum_i p_i E(\Upsilon_i)$ of the
entanglements of formation of the pure states in the ensemble.

Definition: The entanglement of formation E(M) of a bipartite mixed
state M is the minimum of $E(\mathcal{E})$ over ensembles $\mathcal{E} = \{p_i, \Upsilon_i\}$ realizing the
mixed state: $M = \sum_i p_i |\Upsilon_i\rangle\langle\Upsilon_i|$.

We now prove that E(M) is nonincreasing under local operations and 
classical communication. First we prove two lemmas about the entanglement 
of bipartite pure states under local operations by one party, say Alice. Any 
such local action can be decomposed into four basic kinds of operation: (i) 
appending an ancillary system not entangled with Bob's part, (ii) performing 
a unitary transformation, (iii) performing an orthogonal measurement, and 
(iv) throwing away, i.e., tracing out, part of the system. (There is no need 
to add generalized measurements as a separate category, since such measure- 
ments can be constructed from operations of the above kinds.) It is clear that 
neither of the first two kinds of operation can change the entanglement of a 
pure state shared by Alice and Bob: the entanglement in these cases remains 
equal to the von Neumann entropy of Bob's part of the system. However, 
for the last two kinds of operation, the entanglement can change. In the
following two lemmas we show that the expected entanglement in these cases
cannot increase.

Lemma: If a bipartite pure state $\Upsilon$ is subjected to a measurement by
Alice, giving outcomes k with probabilities $p_k$, and leaving residual bipartite
pure states $\Upsilon_k$, then the expected entanglement of formation $\sum_k p_k E(\Upsilon_k)$ of
the residual states is no greater than the entanglement of formation $E(\Upsilon)$ of
the original state.

$$\sum_k p_k E(\Upsilon_k) \le E(\Upsilon) \qquad (6)$$

Proof. Because the measurement is performed locally by Alice, it cannot
affect the reduced density matrix seen by Bob. Therefore the reduced density
matrix seen by Bob before measurement, $\rho = \mathrm{Tr}_A |\Upsilon\rangle\langle\Upsilon|$, must equal the
ensemble average of the reduced density matrices of the residual states after
measurement: $\rho = \sum_k p_k \rho_k$, where $\rho_k = \mathrm{Tr}_A |\Upsilon_k\rangle\langle\Upsilon_k|$. It is well known that
von Neumann entropy, like classical Shannon entropy, is concave, in the sense
that the entropy of a weighted mean of several density matrices is no less
than the corresponding mean of their separate entropies. Therefore

$$S(\rho) \ge \sum_k p_k S(\rho_k). \qquad (7)$$

But the left side of this expression is the original pure state's entanglement 
before measurement, while the right side is the expected entanglement of the 
residual pure states after measurement. 
□ 
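
The entropy inequality of Eq. (7) can be illustrated numerically when Bob's system is a single qubit. This is a sketch under stated assumptions rather than part of the proof: the Bloch-vector parametrization, the function names, and the two residual states chosen below are all illustrative.

```python
import math

def qubit_entropy(bloch):
    """von Neumann entropy in bits of the qubit state rho = (I + r . sigma) / 2.

    The eigenvalues of rho are (1 +/- |r|) / 2, so S depends only on |r|.
    """
    length = min(1.0, math.sqrt(sum(c * c for c in bloch)))
    lam = 0.5 * (1.0 + length)
    if lam >= 1.0:
        return 0.0
    return -lam * math.log2(lam) - (1 - lam) * math.log2(1 - lam)

def average_state(probs, blochs):
    """Bloch vector of the mixture sum_k p_k rho_k."""
    return [sum(p * r[i] for p, r in zip(probs, blochs)) for i in range(3)]

# Hypothetical measurement with two equally likely outcomes; each tuple is the
# Bloch vector of Bob's reduced state rho_k for one outcome.
probs = [0.5, 0.5]
residual = [(0.0, 0.0, 0.6), (0.8, 0.0, 0.0)]

lhs = qubit_entropy(average_state(probs, residual))               # S(rho)
rhs = sum(p * qubit_entropy(r) for p, r in zip(probs, residual))  # sum_k p_k S(rho_k)
print(lhs >= rhs)  # prints: True
```

Here Bob's state before the measurement has Bloch vector (0.4, 0, 0.3), whose entropy exceeds the average entropy of the two residual states, so the expected entanglement after Alice's measurement is lower than before.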

Lemma: Consider a tripartite pure state $\Upsilon$, in which the parts are
labeled A, B, and C. (We imagine Alice holding parts A and C and Bob
holding part B.) Let $M = \mathrm{Tr}_C |\Upsilon\rangle\langle\Upsilon|$. Then $E(M) \le E(\Upsilon)$, where the latter
is understood to be the entanglement between Bob's part B and Alice's part 
AC. That is, Alice cannot increase the minimum expected entanglement by 
throwing away system C. 

Proof. Again, whatever pure-state ensemble one takes as the realization of 
the mixed state M, the entropy at Bob's end of the average of these states
must equal $E(\Upsilon)$, because the density matrix held by Bob has not changed.
By the above argument, then, the average of the entropies of the reduced
density matrices associated with these pure states cannot exceed the entropy
of Bob's overall density matrix; that is, $E(M) \le E(\Upsilon)$.
□

We now prove a theorem that extends both of the above results to mixed
states:

Theorem: If a bipartite mixed state M is subjected to an operation by
Alice, giving outcomes k with probabilities $p_k$, and leaving residual bipartite
mixed states $M_k$, then the expected entanglement of formation $\sum_k p_k E(M_k)$
of the residual states is no greater than the entanglement of formation E(M)
of the original state.

$$\sum_k p_k E(M_k) \le E(M) \qquad (8)$$

(If the operation is simply throwing away part of Alice's system, then there 
will be only one value of k, with unit probability.) 

Proof. Given mixed state M there will exist some minimal-entanglement
ensemble

$$\mathcal{E} = \{p_j, \Upsilon_j\} \qquad (9)$$

of pure states realizing M.

For any ensemble $\mathcal{E}'$ realizing M,

$$E(M) \le E(\mathcal{E}'). \qquad (10)$$

Applying the above lemmas to each pure state in the minimal-entanglement
ensemble $\mathcal{E}$, we get, for each j,

$$\sum_k p_{k|j} E(M_{jk}) \le E(\Upsilon_j), \qquad (11)$$

where $M_{jk}$ is the residual state if pure state $\Upsilon_j$ is subjected to Alice's
operation and yields result k, and $p_{k|j}$ is the conditional probability of obtaining
this outcome when the initial state is $\Upsilon_j$.

Note that when the outcome k has occurred the residual mixed state is
described by the density matrix

$$M_k = \sum_j p_{j|k} M_{jk}. \qquad (12)$$

Multiplying Eq. (11) by $p_j$ and summing over j gives

$$\sum_{j,k} p_j p_{k|j} E(M_{jk}) \le \sum_j p_j E(\Upsilon_j) = E(M). \qquad (13)$$

By Bayes theorem,

$$p_{j,k} = p_j p_{k|j} = p_k p_{j|k}, \qquad (14)$$

Eq. (13) becomes

$$\sum_{j,k} p_k p_{j|k} E(M_{jk}) \le E(M). \qquad (15)$$

Using the bound Eq. (10), applied to each $M_k$ (the minimal ensembles of the
$M_{jk}$, weighted by $p_{j|k}$, together form an ensemble realizing $M_k$), we get

$$\sum_k p_k E(M_k) \le \sum_k p_k \sum_j p_{j|k} E(M_{jk}) \le E(M). \qquad (16)$$

□ 

Although the above theorem concerns a single operation by Alice, it ev- 
idently applies to any finite preparation procedure, involving local actions 
and one- or two-way classical communication, because any such procedure 
can be expressed as a sequence of operations of the above type, performed
alternately by Alice and Bob. Each measurement-type operation, for exam- 
ple, generates a new classical result, and partitions the before-measurement 
mixed state into residual after-measurement mixed states whose mean en- 
tanglement of formation does not exceed the entanglement of formation of 
the mixed state before measurement. Hence we may summarize the result of 
this section by saying that the expected entanglement of formation of a bipartite
system's state does not increase under local operations and classical
communication. As noted in [20], entanglement itself can increase under local
operations, even though its expectation cannot. Thus it is possible for Alice
and Bob to gamble with entanglement, risking some of their initial supply
with a chance of winning more than they originally had.



2.2 Entanglement of Formation for Mixtures of Bell States

In the previous subsection it was shown that an ensemble of pure states with 
minimum average pure-state entanglement realizing a given density matrix 
defines a maximally economical way of creating that density matrix. In 
general it is not known how to find such an ensemble of minimally entangled 
states for a given density matrix M. We have, however, found such minimal 
ensembles for a particular class of states of two spin-1/2 particles, namely,
mixtures that are diagonal when written in the Bell basis Eqs. (1), (3), and
(4). We have also found a lower bound on E(M) applicable to any mixed
state of two spin-1/2 particles. We present these results in this subsection.



As a motivating example consider the Werner states of [pl| . A Werner 
state is a state drawn from an ensemble of F parts pure singlet, and (1— F)/3 
parts of each of the other Bell states — that is, a generalization of Eq. (|5]): 

w F = f|*-x*~| + ^-^(|^ + )(^ + | + |$ + X$ + | + l $ ~)($~D- (17) 

o 

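This is equivalent to saying it is drawn from an ensemble of $x = (4F-1)/3$
parts pure singlet, and $1-x$ parts the totally mixed "garbage" density matrix
(proportional to the identity operator)

$$G = \frac{I}{4} = \frac{1}{4}\left(|\Psi^+\rangle\langle\Psi^+| + |\Psi^-\rangle\langle\Psi^-| + |\Phi^+\rangle\langle\Phi^+| + |\Phi^-\rangle\langle\Phi^-|\right), \qquad (18)$$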

which was Werner's original formulation. We label these generalized Werner
states $W_F$ by their F value, which is their fidelity or purity
$\langle\Psi^-|W_F|\Psi^-\rangle$ relative to a perfect singlet (even though this
fidelity is defined nonlocally, it can be computed from the results of local
measurements, as $1 - 3P_\parallel/2$, where $P_\parallel$ is the probability of
obtaining parallel outcomes if the two spins are measured along the same
random axis).

It would take $x = (4F-1)/3$ pure singlets to create a mixed state $W_F$ by
directly implementing Werner's ensemble. One might assume that this
prescription is the one requiring the least entanglement, so that the $W_{5/8}$
state would cost 0.5 ebits to prepare. However, through a numerical minimization
technique we found four pure states, each having only 0.117 ebits of
entanglement, that when mixed with equal probabilities create the $W_{5/8}$ mixed
state much more economically. Below we derive an explicit minimally-entangled
ensemble for any Bell-diagonal mixed state W, including the Werner states
$W_F$ as a special case, as well as giving a general lower bound for general
mixed states M of a pair of spin-1/2 particles. For pure states and Bell-diagonal
mixtures E(M) is simply equal to this bound.

The lower bound is expressed in terms of a quantity f(M) which we call
the "fully entangled fraction" of M and define as

$$f(M) = \max_{e}\, \langle e|M|e\rangle, \qquad (19)$$

where the maximum is over all completely entangled states $|e\rangle$. Specifically,
we will see that for all states of a pair of spin-1/2 particles, $E(M) \ge h[f(M)]$,
where the function h is defined by

$$h(f) = \begin{cases} H\!\left[\tfrac{1}{2} + \sqrt{f(1-f)}\right] & \text{for } f \ge \tfrac{1}{2}, \\ 0 & \text{for } f < \tfrac{1}{2}. \end{cases} \qquad (20)$$

Here $H(x) = -x\log_2 x - (1-x)\log_2(1-x)$ is the binary entropy function.
For mixtures of Bell states, the fully entangled fraction f(M) is simply the
largest eigenvalue of M.
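As a numerical illustration of Eq. (20), the following minimal Python sketch evaluates h(f) and applies it to the Werner state with F = 5/8, for which the fully entangled fraction is simply F. The function names `binary_entropy` and `h_bound` are illustrative choices, not notation from the text.

```python
import math

def binary_entropy(x):
    """H(x) = -x log2 x - (1-x) log2 (1-x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def h_bound(f):
    """The function h(f) of Eq. (20); f is the fully entangled fraction."""
    if f < 0.5:
        return 0.0
    return binary_entropy(0.5 + math.sqrt(f * (1.0 - f)))

# Werner state with F = 5/8 (fully entangled fraction f = F).
value = h_bound(5 / 8)
```

Under these assumptions `value` comes out close to 0.1176, in line with the figure of 0.117 ebits per state quoted above for the numerically found ensemble.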

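We begin by considering the entanglement of a single pure state $|\phi\rangle$. It
is convenient to write $|\phi\rangle$ in the following orthogonal basis of completely
entangled states:

$$|e_1\rangle = |\Phi^+\rangle, \qquad |e_2\rangle = i|\Phi^-\rangle, \qquad |e_3\rangle = i|\Psi^+\rangle, \qquad |e_4\rangle = |\Psi^-\rangle. \qquad (21)$$

Thus we write

$$|\phi\rangle = \sum_{j=1}^{4} \alpha_j |e_j\rangle. \qquad (22)$$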
i=i 

The entanglement of $|\phi\rangle$ can be computed directly as the von Neumann
entropy of the reduced density matrix of either of the two particles. On
doing this calculation, one finds that the entanglement of $|\phi\rangle$ is given by the
simple formula

$$E = H\!\left[\tfrac{1}{2}\left(1 + \sqrt{1 - C^2}\right)\right], \qquad (23)$$

where $C = |\sum_j \alpha_j^2|$. (Note that one is squaring the complex numbers $\alpha_j$, not
their moduli.) E and C both range from 0 to 1, and E is a monotonically
increasing function of C, so that C itself is a kind of measure of entanglement.
According to Eq. (23), any real linear combination of the states $|e_j\rangle$ is
another completely entangled state (i.e., E = 1). In fact, every completely
entangled state can be written, up to an overall phase factor, as a real linear
combination of the $|e_j\rangle$'s. (To see this, choose $\alpha_1$ to be real without loss of
generality. Then if the other $\alpha_j$'s are not all real, C will be less than unity,
and thus so will E.)
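The formula of Eq. (23) is easy to try out numerically. The sketch below, in plain Python, expands a two-qubit pure state in the basis of Eq. (21) and evaluates E from the sum of the squared components. It assumes the usual conventions $\Phi^\pm = (\uparrow\uparrow \pm \downarrow\downarrow)/\sqrt{2}$ and $\Psi^\pm = (\uparrow\downarrow \pm \downarrow\uparrow)/\sqrt{2}$ and a product-basis ordering chosen for this example; the names `E_BASIS`, `components` and `entanglement` are hypothetical.

```python
import math

SQ = 1.0 / math.sqrt(2.0)

# Rows: the basis vectors of Eq. (21), written in the product basis ordered as
# (up-up, up-down, down-up, down-down). The sign conventions
# Phi+- = (uu +- dd)/sqrt(2) and Psi+- = (ud +- du)/sqrt(2) are assumed here.
E_BASIS = [
    [SQ, 0, 0, SQ],              # e1 = Phi+
    [1j * SQ, 0, 0, -1j * SQ],   # e2 = i Phi-
    [0, 1j * SQ, 1j * SQ, 0],    # e3 = i Psi+
    [0, SQ, -SQ, 0],             # e4 = Psi-
]

def binary_entropy(x):
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def components(state):
    """alpha_j = <e_j|phi> for a state given in the product basis."""
    return [sum(complex(e).conjugate() * a for e, a in zip(row, state))
            for row in E_BASIS]

def entanglement(state):
    """Entanglement E of Eq. (23) for a normalized two-qubit pure state."""
    # C is the modulus of the sum of the squares (not squared moduli).
    c = min(1.0, abs(sum(a * a for a in components(state))))
    return binary_entropy(0.5 * (1.0 + math.sqrt(1.0 - c * c)))
```

For instance, `entanglement([1, 0, 0, 0])` (a product state) vanishes, while `entanglement([SQ, 0, 0, SQ])` (the state $\Phi^+$) gives one ebit.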

Note that if one of the $\alpha_j$'s, say $\alpha_1$, is sufficiently large in magnitude, then
the other $\alpha_j$'s will not have enough combined weight to make C equal to zero,
and thus the state will have to have some entanglement. This makes sense: if
one particular completely entangled state is sufficiently strongly represented
in $|\phi\rangle$, then $|\phi\rangle$ itself must have some entanglement. Specifically, if
$|\alpha_1|^2 > \tfrac{1}{2}$, then because the sum of the squares of the three remaining
$\alpha_j$'s cannot exceed $1 - |\alpha_1|^2$ in magnitude, C must be at least
$|\alpha_1|^2 - (1 - |\alpha_1|^2)$, i.e., $2|\alpha_1|^2 - 1$. It follows from Eq. (23) that E
must be at least $H[\tfrac{1}{2} + \sqrt{|\alpha_1|^2 (1 - |\alpha_1|^2)}]$.
That is, we have shown that

$$E(|\phi\rangle) \ge h(|\alpha_1|^2), \qquad (24)$$

where h is defined in Eq. (20). This inequality will be very important in
what follows.

As one might expect, the properties just described are not unique to the
basis $\{|e_j\rangle\}$. Let $|e'_j\rangle = \sum_k R_{jk}|e_k\rangle$, where R is any real, orthogonal
matrix (i.e., $R^T R = I$). We can expand $|\phi\rangle$ as $|\phi\rangle = \sum_j \alpha'_j |e'_j\rangle$, and
the sum $\sum_j (\alpha'_j)^2$ is guaranteed to be equal to $\sum_j \alpha_j^2$ because of the
properties of orthogonal transformations. Thus one can use the components
$\alpha'_j$ in Eq. (23) just as well as the components $\alpha_j$. In particular, the
inequality (24) can be generalized by substituting for $\alpha_1$ the component of
$|\phi\rangle$ along any completely entangled state $|e\rangle$. That is, if we define
$w = |\langle e|\phi\rangle|^2$ for some completely entangled $|e\rangle$, then

$$E(|\phi\rangle) \ge h(w). \qquad (25)$$

We now move from pure states to mixed states. Consider an arbitrary
mixed state M, and consider any ensemble $\mathcal{E} = \{p_k, \phi_k\}$ which is a
decomposition of M into pure states

$$M = \sum_k p_k |\phi_k\rangle\langle\phi_k|. \qquad (26)$$

For an arbitrary completely entangled state $|e\rangle$, let $w_k = |\langle e|\phi_k\rangle|^2$, and let
$w = \langle e|M|e\rangle = \sum_k p_k w_k$. We can bound the entanglement of the ensemble
(BBT) as follows:

$$E(\mathcal{E}) = \sum_k p_k E(|\phi_k\rangle) \ge \sum_k p_k h(w_k) \ge h\!\left(\sum_k p_k w_k\right) = h(w). \qquad (27)$$



This equation is true in particular for the minimal entanglement ensemble
realizing M for which $E(M) = E(\mathcal{E})$. The second inequality follows from
the convexity of the function h. Clearly we obtain the best bound of this
form by maximizing $w = \langle e|M|e\rangle$ over all completely entangled states $|e\rangle$.
This maximum value of w is what we have called the fully entangled fraction
f(M). We have thus proved that

$$E(M) \ge h[f(M)], \qquad (28)$$

as promised.

To make the bound (28) more useful, we give the following simple algorithm
for finding the fully entangled fraction f of an arbitrary state M of a
pair of qubits. First, write M in the basis $\{|e_j\rangle\}$ defined in Eq. (21). In this
basis, the completely entangled states are represented by the real vectors, so
we are looking for the maximum value of $\langle e|M|e\rangle$ over all real vectors $|e\rangle$.
But this maximum value is simply the largest eigenvalue of the real part of
M. We have then: f = the maximum eigenvalue of Re M, when M is written
in the basis of Eq. (21).
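The algorithm just described can be sketched in a few lines of plain Python. The example below takes a 4x4 density matrix already written in the basis of Eq. (21), keeps its real part, and finds the largest eigenvalue by power iteration. Power iteration is a choice made for this sketch (any symmetric eigenvalue routine would do), and the function name and iteration count are hypothetical.

```python
import math

def fully_entangled_fraction(M, iterations=2000):
    """Largest eigenvalue of Re M, for M given in the basis of Eq. (21).

    Re M is real, symmetric and positive semidefinite, so plain power
    iteration converges to its largest eigenvalue provided the start
    vector is not orthogonal to the top eigenspace (assumed here).
    """
    n = len(M)
    A = [[complex(M[i][j]).real for j in range(n)] for i in range(n)]
    v = [1.0 + 0.1 * i for i in range(n)]  # generic start vector
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    # Rayleigh quotient of the (normalized) converged vector.
    return sum(v[i] * Av[i] for i in range(n))

# A Werner state W_F is diagonal in this basis, with weight F on e4 = Psi-.
F = 5 / 8
werner = [[(1 - F) / 3, 0, 0, 0],
          [0, (1 - F) / 3, 0, 0],
          [0, 0, (1 - F) / 3, 0],
          [0, 0, 0, F]]
f_werner = fully_entangled_fraction(werner)
```

Imaginary off-diagonal entries of M drop out when the real part is taken, so a state can have a smaller fully entangled fraction than its largest eigenvalue would suggest.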

We now show that the bound (28) is actually achieved for two cases of
interest: (i) pure states and (ii) mixtures of Bell states. That is, in these
cases, $E(M) = h[f(M)]$.

(i) Pure states. Any pure state can be changed by local rotations into a
state of the form $|\phi\rangle = \alpha|\uparrow\uparrow\rangle + \beta|\downarrow\downarrow\rangle$, where
$\alpha, \beta \ge 0$ and $\alpha^2 + \beta^2 = 1$.
Entanglement is not changed by such rotations, so it is sufficient to show
that the bound is achieved for states of this form. For $M = |\phi\rangle\langle\phi|$, the
completely entangled state maximizing $\langle e|M|e\rangle$ is $|\Phi^+\rangle$, and the value of f is
$|\langle\Phi^+|\phi\rangle|^2 = (\alpha+\beta)^2/2 = \tfrac{1}{2} + \alpha\beta$. By straightforward
substitution one finds that $h(\tfrac{1}{2} + \alpha\beta) = H(\alpha^2)$, which we know to be
the entanglement of $|\phi\rangle$. Thus
$E(M) = h[f(M)]$, which is what we wanted to show.

(ii) Mixtures of Bell states. Consider a mixed state of the form

$$W = \sum_{j=1}^{4} p_j |e_j\rangle\langle e_j|. \qquad (29)$$

Suppose first that one of the eigenvalues $p_j$ is greater than or equal to
$\tfrac{1}{2}$, and without loss of generality take this eigenvalue to be $p_1$. The
following eight pure states, mixed with equal probabilities, yield the state W:

$$\sqrt{p_1}\,|e_1\rangle + i\left(\pm\sqrt{p_2}\,|e_2\rangle \pm \sqrt{p_3}\,|e_3\rangle \pm \sqrt{p_4}\,|e_4\rangle\right). \qquad (30)$$

Moreover, all of these pure states have the same entanglement, namely,

$$E = h(p_1). \qquad (31)$$

(See Eq. (23).) Therefore the average entanglement of the mixture is also
$\langle E\rangle = h(p_1)$. But $p_1$ is equal to f(W) for this density matrix, so for this
particular mixture, we have $\langle E\rangle = h[f(W)]$. Since the right hand side is our
lower bound on E, this mixture must be a minimum-entanglement decomposition
of W, and thus $E(W) = h[f(W)]$.
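The two claims made about the eight states of Eq. (30), that their equal mixture reproduces W and that each has $C = 2p_1 - 1$, can be checked directly. The following plain-Python sketch works entirely in the basis of Eq. (21); the function names and the sample eigenvalues are hypothetical choices for illustration.

```python
import itertools
import math

def ensemble_of_eq30(p):
    """The eight states of Eq. (30) as coefficient vectors in the basis of Eq. (21)."""
    r = [math.sqrt(x) for x in p]
    states = []
    for s2, s3, s4 in itertools.product((1, -1), repeat=3):
        states.append([complex(r[0]), 1j * s2 * r[1], 1j * s3 * r[2], 1j * s4 * r[3]])
    return states

def mixture(states):
    """Equal-weight mixture of the projectors onto the given states (4x4 matrix)."""
    n = len(states)
    return [[sum(s[i] * s[j].conjugate() for s in states) / n for j in range(4)]
            for i in range(4)]

def c_value(state):
    """C of Eq. (23): modulus of the sum of the squared components."""
    return abs(sum(a * a for a in state))

p = [0.7, 0.1, 0.15, 0.05]          # sample Bell-diagonal eigenvalues, p1 >= 1/2
states = ensemble_of_eq30(p)
W = mixture(states)                  # should be diag(p) in the basis of Eq. (21)
c_values = [c_value(s) for s in states]
```

With these sample values every entry of `c_values` equals $2p_1 - 1 = 0.4$ up to rounding, so by Eq. (23) each state has entanglement $h(p_1)$, and the off-diagonal entries of `W` cancel because every relative sign pattern occurs equally often.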

If none of the eigenvalues $p_j$ is greater than $\tfrac{1}{2}$, then there exist phase
factors $\theta_j$ such that $\sum_j p_j e^{i\theta_j} = 0$. In that case we can express W as
an equal mixture of a different set of eight states:

$$\sqrt{p_1}\,e^{i\theta_1/2}|e_1\rangle \pm \sqrt{p_2}\,e^{i\theta_2/2}|e_2\rangle \pm \sqrt{p_3}\,e^{i\theta_3/2}|e_3\rangle \pm \sqrt{p_4}\,e^{i\theta_4/2}|e_4\rangle. \qquad (32)$$

For each of these states, the quantity C [Eq. (23)] is equal to zero, and thus
the entanglement is zero. It follows that E(W) = 0, so that again the bound
is achieved. (The bound $h[f(W)]$ is zero in this case because f, the greatest
of the $p_j$'s, is less than $\tfrac{1}{2}$.)

It is interesting to ask whether the bound $h[f(M)]$ is in fact always equal
to E(M) for general mixed states M, not necessarily Bell-diagonal. It turns
out that it is not. Consider, for example, the mixed state

$$M = \tfrac{1}{2}|\uparrow\uparrow\rangle\langle\uparrow\uparrow| + \tfrac{1}{2}|\Psi^+\rangle\langle\Psi^+|. \qquad (33)$$

The value of f for this state is $\tfrac{1}{2}$, so that h(f) = 0. And yet, as we now
show, it is impossible to build this state out of unentangled pure states;
hence E(M) is greater than zero and is not equal to h(f).

To see this, let us try to construct the density matrix of Eq. (33) out of
unentangled pure states. That is, we want

$$M = \sum_k p_k |\phi_k\rangle\langle\phi_k|, \qquad (34)$$

where each $|\phi_k\rangle$ is unentangled. That is, each $|\phi_k\rangle$ is such that when we write
it in the basis of Eq. (21), i.e. as $|\phi_k\rangle = \sum_{j=1}^{4} \alpha_{k,j}|e_j\rangle$, the $\alpha$'s
satisfy the condition

$$\sum_{j=1}^{4} \alpha_{k,j}^2 = 0. \qquad (35)$$

Now the density matrix M, when written in the $|e_j\rangle$ basis, looks like this:

$$M = \frac{1}{4}\begin{pmatrix} 1 & i & 0 & 0 \\ -i & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (36)$$

Thus, in order for Eq. (34) to be true, the $\alpha$'s must be consistent with the
following conditions:

$$\sum_k p_k |\alpha_{k,1}|^2 = \tfrac{1}{4}, \qquad \sum_k p_k |\alpha_{k,2}|^2 = \tfrac{1}{4}, \qquad \sum_k p_k |\alpha_{k,3}|^2 = \tfrac{1}{2},$$
$$\sum_k p_k |\alpha_{k,4}|^2 = 0, \qquad \sum_k p_k\, \alpha_{k,1}\alpha_{k,2}^* = \tfrac{i}{4}. \qquad (37)$$

Evidently all the $\alpha_{k,4}$'s are equal to zero. By Eq. (35) the remaining $\alpha$'s
satisfy

$$|\alpha_{k,1}|^2 + |\alpha_{k,2}|^2 \ge |\alpha_{k,3}|^2 \quad \text{for every } k. \qquad (38)$$

In fact, the "$\ge$" of this last relation must be an equality, or else the sum
conditions of Eq. (37) would not work out. That is,

$$|\alpha_{k,1}|^2 + |\alpha_{k,2}|^2 = |\alpha_{k,3}|^2 \quad \text{for every } k. \qquad (39)$$

Combining this last equation with Eq. (35), we arrive at the conclusion that
for each k, the ratio of $\alpha_{k,1}$ to $\alpha_{k,2}$ is real. But in that case there is no way
to generate the imaginary sum required by the last of the conditions (37).
It is thus impossible to build M out of unentangled pure states; that is,
E(M) > 0.

We conclude, then, that our bound is only a bound and not an exact
formula for E. It turns out, in fact, that there are two other ways to prove
that the state M has nonzero entanglement of formation. Peres [26] and
Horodecki et al. [27] have recently developed a general test for nonzero
entanglement for states of two qubits and have applied it explicitly to states like
our M, showing that E(M) is nonzero. Also, in Sec. 3.2.2 below, we show
that one can distill some pure entanglement from M, which would not be
possible if E(M) were zero.

3 Purification



Suppose Alice and Bob have n pairs of particles, each pair's state described 
by a density matrix M. Such a mixed state results if one or both members 
of an initially pure Bell state is subjected to noise during transmission or 
storage (cf. Fig. [l]). Given these n impure pairs, how many pure Bell singlets 
can they distill by local actions; indeed, can they distill any at all? In other 
words, how much entanglement can they "purify" out of their mixed state 
without further use of a quantum channel to share more entanglement? 

The complete answer is not yet known, but upper and lower bounds
are known [17]. An upper bound is E(M) per pair, because if Alice and Bob could
get more good singlets than that, they could use them to create more mixed
states with density matrix M than the number with which they started,
thereby increasing their entanglement by local operations, which we have
proven impossible (Sec. 2.1). Lower bounds are given by construction. We
have found specific procedures which Alice and Bob can use to purify cer- 
tain types of mixed states into a lesser number of pure singlets. We call 
these schemes entanglement purification protocols (EPP), which should not 
be confused with the purifications of a mixed state of 



3.1 Purification Basics 

Our purification procedures all stem from a few simple ideas: 

1. A general two-particle mixed state M can be converted to a Werner
state $W_F$ (Eq. (17)) by an irreversible preprocessing operation which
increases the entropy ($S(W_F) > S(M)$), perhaps wasting some of its
recoverable entanglement, but rendering the state easier to deal with because
it can thereafter be regarded as a classical mixture of the four
orthogonal Bell states (Eqs. ([!]), ([|), and (^)) [^9|. The simplest such 



preprocessing operation, a random bilateral rotation [17] or "twirl", consists
of choosing an independent, random SU(2) rotation for each impure pair
and applying it to both members of the pair (cf. Fig. 5). Because of
the singlet state's invariance under bilateral rotation, twirling has the
effect of removing off-diagonal terms in the two-particle density matrix
in the Bell basis, as well as equalizing the triplet eigenvalues. Actually,
removing the off-diagonal terms is sufficient, as all of our EPP protocols
operate successfully (with only minor modification) on a Bell-diagonal
mixed state W with, in general, unequal triplet eigenvalues. Equalization
of the triplet eigenvalues only adds unnecessary entropy to the
mixture. In Appendix A it is shown that a continuum of rotations is
unnecessary: an arbitrary mixed state of two qubits can be converted
into a Werner $W_F$ or Bell-diagonal W mixture by a "discrete twirl,"
consisting of a random choice among an appropriate discrete set of bilateral
rotations [30]. We use T to denote the nonunitary operation of
performing either a discrete or a continuous twirl.




Figure 5: The general mixed state M of Fig. 3 can be converted into one of
the Werner form $W_F$ of Eq. (17) if the particles on both Alice's and Bob's side
are subjected to the same random rotation R (we refer to the act of choosing
a random SU(2) rotation and applying it to both particles as a "twirl" T).




2. Once the initial mixed state M has been rendered into Bell-diagonal 
form W, it can be purified as if it were a classical mixture of Bell states, 
without regard to the original mixed state M or the noisy channel(s) 
that may have generated it [31]. This is extremely convenient for the
development of all our protocols. However, as we show in Appendix || 
all the purification protocols we will develop will also work just as well 
on the original non Bell-diagonal mixtures M. 

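3. Bell states map onto one another under several kinds of local unitary
operations (cf. Table 1). These three sets of operations are of two types:
unilateral operations which are performed by Bob or Alice but not both,
and bilateral operations which can be written as a tensor product of
an Alice part and a Bob part, each of which are the same. The three
types of operations used are: 1) Unilateral rotations by $\pi$ radians,
corresponding to the three Pauli matrices $\sigma_x$, $\sigma_y$, and $\sigma_z$; 2) Bilateral
rotations by $\pi/2$ radians, henceforth denoted $B_x$, $B_y$, and $B_z$; and 3)
The bilateral application of the two-bit quantum XOR (or controlled-NOT)
[32, 33], hereafter referred to as the BXOR operation (see Fig. 6).
These operations and the Bell state mappings they implement, along
with individual particle measurements, are the basic tools Alice and
Bob use to purify singlets out of W.

Unilateral $\pi$ rotations (column headings give the source state):

$$\begin{array}{c|cccc}
 & \Psi^- & \Phi^- & \Phi^+ & \Psi^+ \\ \hline
I & \Psi^- & \Phi^- & \Phi^+ & \Psi^+ \\
\sigma_x & \Phi^- & \Psi^- & \Psi^+ & \Phi^+ \\
\sigma_y & \Phi^+ & \Psi^+ & \Psi^- & \Phi^- \\
\sigma_z & \Psi^+ & \Phi^+ & \Phi^- & \Psi^-
\end{array}$$

Bilateral $\pi/2$ rotations (column headings give the source state):

$$\begin{array}{c|cccc}
 & \Psi^- & \Phi^- & \Phi^+ & \Psi^+ \\ \hline
I & \Psi^- & \Phi^- & \Phi^+ & \Psi^+ \\
B_x & \Psi^- & \Phi^- & \Psi^+ & \Phi^+ \\
B_y & \Psi^- & \Psi^+ & \Phi^+ & \Phi^- \\
B_z & \Psi^- & \Phi^+ & \Phi^- & \Psi^+
\end{array}$$

Bilateral XOR (column headings give the initial source state, row labels the
initial target state):

$$\begin{array}{c|cccc|l}
\text{target} & \Psi^- & \Phi^- & \Phi^+ & \Psi^+ & \\ \hline
\Psi^- & \Psi^+ & \Phi^+ & \Phi^- & \Psi^- & \text{(source)} \\
       & \Phi^- & \Psi^- & \Psi^- & \Phi^- & \text{(target)} \\ \hline
\Phi^- & \Psi^+ & \Phi^+ & \Phi^- & \Psi^- & \text{(source)} \\
       & \Psi^- & \Phi^- & \Phi^- & \Psi^- & \text{(target)} \\ \hline
\Phi^+ & \Psi^- & \Phi^- & \Phi^+ & \Psi^+ & \text{(source)} \\
       & \Psi^+ & \Phi^+ & \Phi^+ & \Psi^+ & \text{(target)} \\ \hline
\Psi^+ & \Psi^- & \Phi^- & \Phi^+ & \Psi^+ & \text{(source)} \\
       & \Phi^+ & \Psi^+ & \Psi^+ & \Phi^+ & \text{(target)}
\end{array}$$

Table 1: The unilateral and bilateral operations used by Alice and Bob to
map Bell states to Bell states. Each entry of the BXOR table has two lines,
the first showing what happens to the source state, the second showing what
happens to the target state.

Figure 6: The BXOR operation. A solid dot indicates the source bit of an
XOR operation [32] and a crossed circle indicates the target. In this example
a $\Psi^-$ state is the source and a $\Phi^+$ is the target. If the pairs are later brought
back together and measured in the Bell basis the source will remain a $\Psi^-$
and the target will have become a $\Psi^+$, as per Table 1.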

4. Alice and Bob can distinguish $\Phi$ states from $\Psi$ states by locally
measuring their particles along the z direction. If they get the same results
they have a $\Phi$, if they get opposite results they have a $\Psi$. Note that if
only one of the observers (say Bob) needs to know whether the state
was a $\Phi$ or a $\Psi$, the process can be done without two-way communication.
Alice simply makes her measurement and sends the result to Bob.
After Bob makes his measurement, he can then determine whether the
state had been a $\Phi$ or a $\Psi$ by comparing his measurement result with
Alice's, without any further communication.

5. For convenience we take $|\Phi^+\rangle$ as the standard state for the rest of the
paper. This is because it is the state which, when used as both source
and target in a BXOR, remains unchanged. It is not necessary to
use this convention but it is algebraically simpler. We note that $|\Phi^+\rangle$
states can be converted to singlet ($|\Psi^-\rangle$) states using the unilateral
$\sigma_y$ rotation, as shown in Table 1. The only complication is that the
nonunitary twirling operation T of item 1 works only when $|\Psi^-\rangle$ is
taken as the standard state. But a modified twirl $T'$ which leaves
$|\Phi^+\rangle$ invariant and randomizes the other three Bell states may easily
be constructed: the modified twirl would consist of a unilateral $\sigma_y$
(which swaps the $|\Phi^+\rangle$'s and $|\Psi^-\rangle$'s) followed by a conventional twirl
T, followed by another unilateral $\sigma_y$ (which swaps them back).

6. The preceding points all suggest a new notation for the Bell states. We
use two classical bits to label each of the Bell states and write

$$\Phi^+ = 00, \qquad \Psi^+ = 01, \qquad \Phi^- = 10, \qquad \Psi^- = 11. \qquad (40)$$

The right, low-order or "amplitude" bit identifies the $\Phi/\Psi$ property
of the Bell state, while the left, high-order or "phase" bit identifies
the $+/-$ property. Both properties could be distinguished simultaneously
by a nonlocal measurement, but local measurements can only
distinguish one of the properties at a time, randomizing the other. For
example a bilateral z spin measurement distinguishes the amplitude
while randomizing the phase.
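In this two-bit notation the Bell-state mappings become simple bit operations. The Python sketch below encodes Eq. (40) together with two rules that are assumptions of this example rather than statements taken from the text: a unilateral Pauli rotation flips the amplitude bit ($\sigma_x$), the phase bit ($\sigma_z$) or both ($\sigma_y$), and a BXOR adds the target's phase bit into the source and the source's amplitude bit into the target. These are the standard actions of these gates on Bell states; they reproduce the facts of item 5 that $\sigma_y$ swaps $\Phi^+$ with $\Psi^-$ and that $\Phi^+$ used as both source and target is left unchanged. The names `BELL`, `unilateral` and `bxor` are illustrative.

```python
# Two-bit labels of Eq. (40), stored as (phase bit, amplitude bit).
BELL = {"Phi+": (0, 0), "Psi+": (0, 1), "Phi-": (1, 0), "Psi-": (1, 1)}
NAME = {bits: name for name, bits in BELL.items()}

def unilateral(pauli, state):
    """Action of a unilateral pi rotation on a Bell state (overall phase ignored)."""
    phase, amp = BELL[state]
    if pauli in ("x", "y"):
        amp ^= 1      # sigma_x and sigma_y flip the amplitude (Phi/Psi) bit
    if pauli in ("z", "y"):
        phase ^= 1    # sigma_z and sigma_y flip the phase (+/-) bit
    return NAME[(phase, amp)]

def bxor(source, target):
    """Assumed BXOR rule; returns (new source, new target)."""
    (s_phase, s_amp), (t_phase, t_amp) = BELL[source], BELL[target]
    new_source = (s_phase ^ t_phase, s_amp)   # phase propagates target -> source
    new_target = (t_phase, s_amp ^ t_amp)     # amplitude propagates source -> target
    return NAME[new_source], NAME[new_target]

standard = bxor("Phi+", "Phi+")
```

Applying `bxor` twice to the same pair returns the original labels, as expected for an operation built from controlled-NOT gates.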



3.2 Purification Protocols 

We now present several two- and one-way purification protocols. All begin
with a large collection of n impure pairs each in mixed state M, use up $n-m$
of them (by measurement), while maneuvering the remaining m pairs into
a collective state $M'$ whose fidelity $\langle(\Phi^+)^m|M'|(\Phi^+)^m\rangle$ relative to a product
of m standard $\Phi^+$ states approaches 1 in the limit of large n. The yield of a
purification protocol P on input mixed states M is defined as

$$D_P(M) = \lim_{n\to\infty} m/n. \qquad (41)$$

If the original impure pairs M arise from sharing pure EPR pairs through
a noisy channel $\chi$, then the yield $D_P(M)$ defines the asymptotic number
of qubits that can be reliably transmitted (via teleportation) per use of the
channel. For one-way protocols the yield is equal to the rate of a corresponding
quantum error-correcting code (cf. Section 5). For two-way protocols,
there is no corresponding quantum error-correcting code. We will compare
the yields from our protocols with the rates of quantum error-correcting codes
introduced by other authors, and with known upper bounds on the one-way
and two-way distillable entanglements $D_1(W)$ and $D_2(W)$. These are defined
in the obvious way, e.g. $D_1(W) = \max\{D_P(W) : P \text{ is a 1-EPP}\}$. No
entanglement purification protocol has been proven optimal, but all give lower
bounds on the amount of entanglement that can be distilled from various
mixed states.

Table 2: Probabilities for each initial configuration of source and target in
a pair of Bell states drawn from the same ensemble, and the resulting state
configuration after a BXOR operation is applied. The final column shows
whether the target state passes (P) or fails (F) the test for being parallel
along the z-axis (this is given by the rightmost bit of the target state after
the BXOR). This table, ignoring the probability column, is just the BXOR
table of Table 1 written in the bitwise notation of item 6 of Sec. 3.1.



3.2.1 Recurrence method 

A purification procedure presented originally in [17] is the recurrence method.
This is an explicitly two-way protocol. Two states are drawn from an ensemble
which is a mixture of Bell states with probabilities $p_i$, where i labels the
Bell states in our two-bit notation. (As noted earlier, if the original impure
state is not Bell-diagonal, it can be made so by twirling.) The 00 state is
again taken to be the standard state and we take $p_{00} = F$. The two states are
used as the source and target for the BXOR operation. Their initial states
and probabilities, and states after the BXOR operation, are shown in Table 2.
Alice and Bob test the target states, and then separate the source states into
the ones whose target states passed and the ones whose target state failed.
Each of these subsets is a Bell state mixture, with new probabilities. These
a posteriori probabilities for the 'passed' subset are:

$$p'_{00} = (p_{00}^2 + p_{10}^2)/p_{\rm pass}, \qquad p'_{01} = (p_{01}^2 + p_{11}^2)/p_{\rm pass},$$
$$p'_{10} = 2p_{00}p_{10}/p_{\rm pass}, \qquad p'_{11} = 2p_{01}p_{11}/p_{\rm pass}, \qquad (42)$$

with

$$p_{\rm pass} = p_{00}^2 + p_{01}^2 + p_{10}^2 + p_{11}^2 + 2p_{00}p_{10} + 2p_{01}p_{11}. \qquad (43)$$

Consider the situation where Alice and Bob begin with a large supply of
Werner states $W_F$. They apply the preceding procedure and are left with a
subset of states which passed and a subset which failed. For the members of
the "passed" subset $p'_{00} > p_{00}$ for all $p_{00} > 0.5$. The members of the "failed"
subset have $p_{00} = p_{01} = p_{10} = p_{11} = 1/4$. Since the entanglement E of this
mixture is 0, it will clearly not be possible to extract any entanglement from
the "failed" subset, so all members of this subset are discarded. Note that
this is where the protocol explicitly requires two-way communication. Both
Alice and Bob need to know the results of the test in order to determine
which pairs to discard.
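One round of the recurrence method can be reproduced by enumerating the sixteen source/target configurations. The Python sketch below does this in the two-bit notation, assuming the standard BXOR action on Bell-state labels (the source's phase bit picks up the target's phase bit, and the target's amplitude bit picks up the source's amplitude bit); a target "passes" when its amplitude bit after the BXOR is 0. The function names are hypothetical.

```python
from itertools import product

def recurrence_step(p):
    """One BXOR-and-test round on pairs drawn from a Bell-diagonal ensemble.

    p maps labels (phase bit, amplitude bit) to probabilities.
    Returns (p_pass, passed, failed), where passed and failed are the
    normalized distributions of the source pairs in the two subsets.
    """
    passed = {k: 0.0 for k in p}
    failed = {k: 0.0 for k in p}
    for (s_phase, s_amp), (t_phase, t_amp) in product(p, repeat=2):
        weight = p[(s_phase, s_amp)] * p[(t_phase, t_amp)]
        new_source = (s_phase ^ t_phase, s_amp)   # assumed BXOR rule
        target_amp = s_amp ^ t_amp                # rightmost bit of the target
        bucket = passed if target_amp == 0 else failed
        bucket[new_source] += weight
    p_pass = sum(passed.values())
    p_fail = sum(failed.values())
    if p_pass > 0:
        passed = {k: v / p_pass for k, v in passed.items()}
    if p_fail > 0:
        failed = {k: v / p_fail for k, v in failed.items()}
    return p_pass, passed, failed

def werner(F):
    """Werner state in the convention where 00 (Phi+) carries weight F."""
    q = (1.0 - F) / 3.0
    return {(0, 0): F, (0, 1): q, (1, 0): q, (1, 1): q}

p_pass, passed, failed = recurrence_step(werner(0.7))
```

For F = 0.7 the enumeration agrees with Eqs. (42) and (43): the pass probability is 0.68, the passed pairs have $p'_{00} = 0.5/0.68 \approx 0.735$, and the failed pairs are an equal mixture of the four Bell states.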

The members of the "passed" subset have a greater $p_{00}$ than those in the
original set of impure pairs. The new density matrix is still Bell diagonal,
but is no longer a Werner state $W_F$. Therefore, a twirl $T'$ is applied (Sec. 3.1,
items 1 and 5), leaving the $p_{00}$ component alone and equalizing the others.
(It is appropriate in this situation to use the modified twirl $T'$ which leaves
$\Phi^+$ invariant, as explained in item 5 of Sec. 3.1.) We are left with a new situation
similar to the starting situation, but with a higher fidelity $F' = p'_{00}$. Figure 7
shows the resulting $F'$ versus F. The process is then repeated; iterating
the function of Fig. 7 will continue to improve the fidelity. This can be
continued until the fidelity is arbitrarily close to 1. C. Macchiavello has
found that faster convergence can be achieved by substituting a deterministic
bilateral $B_x$ rotation for the twirl $T'$. With this modification, the density
matrix remains Bell-diagonal, but no longer has the Werner form $W_F$ after
the first iteration; nevertheless its $p_{00}$ component increases more rapidly with
successive iterations.
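For Werner-state inputs with the twirl reapplied after every round, Eqs. (42) and (43) reduce to a one-dimensional map from F to $F'$, which can be iterated numerically. The sketch below is a minimal illustration in Python; it sets $p_{00} = F$ and the other three probabilities to $(1-F)/3$ in Eqs. (42) and (43), and it tracks the fraction of pairs surviving under the assumption that half the pairs are measured and the failures discarded in each round. The function names are hypothetical.

```python
def recurrence_map(F):
    """Return (F', p_pass) for one recurrence round on Werner pairs W_F."""
    q = (1.0 - F) / 3.0
    # Eq. (43) with p00 = F and p01 = p10 = p11 = q.
    p_pass = F * F + 2.0 * F * q + 5.0 * q * q
    # Eq. (42): new p00 of the passed pairs.
    return (F * F + q * q) / p_pass, p_pass

def iterate(F, steps):
    """Fidelity history and surviving fraction of pairs after `steps` rounds."""
    history = [F]
    remaining = 1.0
    for _ in range(steps):
        F, p_pass = recurrence_map(F)
        remaining *= p_pass / 2.0   # targets are sacrificed, failures discarded
        history.append(F)
    return history, remaining

history, remaining = iterate(0.6, 40)
```

The fidelity rises monotonically toward 1 from any starting value above 1/2, while `remaining` shrinks by at least a factor of two per round, which illustrates why the yield of pure recurrence tends to zero.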

Even with this improvement the recurrence method is rather inefficient,
approaching zero yield in the limit of high output fidelity, since in each
iteration at least half the pairs are lost (one out of every two is measured,
and the failures are discarded). Figure 7 shows the fraction of pairs lost
on each iteration. A positive yield, $D_2$, even in the limit of perfect output
fidelity can be obtained by switching over from the recurrence method to
the hashing method, to be described in Section 3.2.3, as soon as so doing
will produce more good singlets than doing another step of recurrence. The
yield versus initial fidelity of these combined recurrence-hashing protocols is
shown in Figure 8.

Figure 7: Effect on the fidelity of Werner states of one step of purification,
using the recurrence protocol. F is the initial fidelity of the Werner state
(Eq. (17)), $F'$ is the final fidelity of the "passed" pairs after one iteration.
Also shown is the fraction $p_{\rm pass}/2$ of pairs remaining after one iteration (cf.
Eq. (43)).

Figure 8: Measures of entanglement versus fidelity F for Werner states $W_F$
of Eq. (17). E is the entanglement of formation, Eq. (27). $D_R$ is the yield
of the recurrence method of Sec. 3.2.1 continued by the hashing method
of Sec. 3.2.3. $D_M$ is the yield of the modified recurrence method of C.
Macchiavello [34], continued by hashing. $D_H$ is the yield of the one-way hashing
and breeding protocols (Sec. 3.2.4) used alone. $D_{CS}$ is the rate of the
quantum error correcting codes proposed by Calderbank and Shor [10] and
Steane [11]. $B_{KL}$ is the upper bound for $D_1$ as shown in Sec. |6l^ (following
Knill and Laflamme [40]).

Figure 9: The same as Fig. 8 exhibited on logarithmic scales. The value
along the x-axis is proportional to the logarithm of $(F - 0.5)$. In this form
it is clear that E, $D_M$ and $D_R$ follow power laws $(F - 0.5)^\alpha$. The ripples in
$D_M$ and $D_R$ are real, and arise from the variable number of recurrence steps
performed before switching over to the hashing protocol [17].

It is important to note that the recurrence-hashing method gives a positive
yield of purified singlets from all Werner states with fidelity greater
than 1/2. Werner states of fidelity 1/2 or less have E = 0 and therefore can
yield no singlets. The pure hashing and breeding protocols, described below,
which are one-way protocols, work only down to $F \approx 0.8107$, and even the
best known one-way protocol [35] works only down to $F \approx 0.8096$.



3.2.2 Direct purification of non-Bell-diagonal mixtures 

Most of the purification strategies discussed in this paper assume that the 
state to be purified is first brought to the Werner form, or at least to Bell- 
diagonal form, by means of a twirling operation. As we have said, though, 
this strategy is somewhat wasteful and we use it only to make the analysis 
manageable. In this subsection we give a simple example showing how a 
state can be purified directly with no twirling. For this particular example, 
it happens that the purification is accomplished in a single step rather than 
in a series of steps that gradually raise the fidelity. 
Consider again the state M of Eq. (33):

$$M = \tfrac{1}{2}|\uparrow\uparrow\rangle\langle\uparrow\uparrow| + \tfrac{1}{2}|\Psi^+\rangle\langle\Psi^+|. \qquad (44)$$

Note that because the fully-entangled fraction (Eq. (19)) f = 1/2 for this
state, it cannot be purified by the recurrence method. However, a collection of
pairs in this state can be purified using the following two-way protocol:
as in the recurrence method, Alice and Bob first perform the BXOR operation
between pairs of pairs, and then bilaterally measure each target pair in the
up-down basis. One can show that if the outcome of this measurement on
a given target pair is "down-down," then the corresponding source pair is
left in the completely entangled state $|\Psi^+\rangle$. Alice and Bob therefore keep the
source pair only when they get this outcome, and discard it otherwise. The
probability of getting the outcome "down-down" is 1/8, and since each target
pair had to be sacrificed for the measurement, the yield from this procedure
is $D_2 = 1/16$. The same strategy works for any state of the form

$$M = (1-p)|\uparrow\uparrow\rangle\langle\uparrow\uparrow| + p|\Psi^+\rangle\langle\Psi^+|, \qquad (45)$$

with yield $D_2 = p^2/4$.

A recent paper by Horodecki et al. [36] presents a more general approach
to the purification of mixed states which, like the above scheme, does not
start by bringing the states to Bell-diagonal form. Their strategy begins
with a filtering operation aimed at increasing the fully entangled fraction
f (Eq. (19)) of the surviving pairs; these pairs are then subjected to the
recurrence procedure described above. These authors have shown that by
this technique, one can distill some amount of pure entanglement from any
state of two qubits having a nonzero entanglement of formation. In other
words, they have obtained for such systems the very interesting result that
if E(M) is nonzero, then so is $D_2(M)$.
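The single-step protocol of this subsection (a BXOR between two pairs followed by an up-down measurement of the target pair) is small enough to check with an explicit state-vector calculation. The following Python sketch treats the state of Eq. (45) as a mixture of the two pure states $|\uparrow\uparrow\rangle$ and $|\Psi^+\rangle$, with "up" encoded as bit 0 and "down" as bit 1; this encoding, the dictionary representation of states, and the function names are assumptions made for the illustration.

```python
import math
from itertools import product

SQ = 1.0 / math.sqrt(2.0)
UP_UP = {(0, 0): 1.0}                    # |up up>, keyed by (Alice bit, Bob bit)
PSI_PLUS = {(0, 1): SQ, (1, 0): SQ}      # (|up down> + |down up>)/sqrt(2)

def bxor_then_measure(source, target):
    """Apply the BXOR and project the target pair onto down-down = (1, 1).

    Returns (probability of that outcome, unnormalized source state left behind).
    """
    kept = {}
    for (s, s_amp), (t, t_amp) in product(source.items(), target.items()):
        # Each party XORs the source bit into the target bit on their own side.
        new_t = (t[0] ^ s[0], t[1] ^ s[1])
        if new_t == (1, 1):
            kept[s] = kept.get(s, 0.0) + s_amp * t_amp
    prob = sum(abs(a) ** 2 for a in kept.values())
    return prob, kept

def direct_purification(p):
    """Return (probability of down-down, fidelity of kept source pairs with Psi+)."""
    components = [(1.0 - p, UP_UP), (p, PSI_PLUS)]
    total = 0.0
    overlap = 0.0
    for (w_s, src), (w_t, tgt) in product(components, repeat=2):
        prob, kept = bxor_then_measure(src, tgt)
        total += w_s * w_t * prob
        amp = sum(PSI_PLUS.get(k, 0.0) * a for k, a in kept.items())
        overlap += w_s * w_t * abs(amp) ** 2
    fidelity = overlap / total if total > 0 else 0.0
    return total, fidelity

prob_dd, fidelity = direct_purification(0.5)
```

For p = 1/2 the "down-down" outcome occurs with probability 1/8 and leaves the source pair in $\Psi^+$ with unit fidelity; dividing by the two pairs consumed per attempt gives the yield $p^2/4$.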

3.2.3 One-way hashing method 

This protocol uses methods analogous to those of universal hashing in classical
privacy amplification. (We will give a self-contained treatment of this
hashing scheme here.) Given a large number n of impure pairs drawn from
a Bell-diagonal ensemble of known density matrix W, this protocol allows
Alice and Bob to distill a smaller number $m \approx n(1 - S(W))$ of purified pairs
(e.g. near-perfect $\Phi^+$ states) whenever S(W) < 1. In the limit of large n,
the output pairs approach perfect purity, while the asymptotic yield m/n
approaches $1 - S(W)$. This hashing protocol supersedes our earlier breeding
protocol [17], which we will review briefly in Sec. 3.2.4.
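The asymptotic yield $1 - S(W)$ is simple to evaluate for Werner states, where the Bell-state probabilities are F and three copies of $(1-F)/3$. The Python sketch below computes the yield and locates, by bisection, the fidelity at which S(W) reaches 1 and the yield drops to zero. The function names and the bisection tolerance are hypothetical choices.

```python
import math

def entropy_bits(probs):
    """Entropy in bits of a probability distribution (zero terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def werner_probs(F):
    """Bell-state probabilities of the Werner state W_F."""
    q = (1.0 - F) / 3.0
    return [F, q, q, q]

def hashing_yield(probs):
    """Asymptotic hashing yield 1 - S(W); zero when S(W) >= 1."""
    return max(0.0, 1.0 - entropy_bits(probs))

def hashing_threshold(lo=0.5, hi=1.0, tol=1e-10):
    """Bisection for the F with S(W_F) = 1; S decreases with F on [0.5, 1]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if entropy_bits(werner_probs(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

threshold = hashing_threshold()
```

The threshold found this way is close to 0.8107, the value below which the one-way hashing protocol gives no yield on Werner states.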



The hashing protocol works by having Alice and Bob each perform BXORs
and other local unitary operations (Table 1) on corresponding members
of their pairs, after which they locally measure some of the pairs to gain
information about the Bell states of the remaining unmeasured pairs. By the
correct choice of local operations, each measurement can be made to reveal
almost 1 bit about the unmeasured pairs; therefore, by sacrificing slightly
more than nS(W) pairs, where S(W) is the von Neumann entropy (See Eq. 
@) of the impure pairs, the Bell states of all the remaining unmeasured 
pairs can, with high probability, be ascertained. Then local unilateral Pauli
rotations ($\sigma_{x,y,z}$) can be used to restore each unmeasured pair to the standard
$\Phi^+$ state.

The hashing protocol requires only one-way communication: after Alice
finishes her part of the protocol, in the process having measured $n-m$ of her
qubits, she is able to send Bob classical information which, when combined
with his measurement results, enables him to transform his corresponding
unmeasured qubits into near-perfect $\Phi^+$ twins of Alice's unmeasured qubits,
as shown in Fig. [| 

Let $\delta$ be a small positive parameter that will later be allowed to approach
zero in the limit of large n. The initial sequence of n impure pairs can be
conveniently represented by a 2n-bit string $x_0$ formed by concatenating the
two-bit representations (Eq. (40)) of the Bell states of the individual pairs,
the sequence $\Psi^- \Phi^+ \Phi^-$ for example being represented 110010. The parity
of a bit string is the modulo-2 sum of its bits; the parity of a subset s of
the bits in a string x can be expressed as a Boolean inner product $s \cdot x$,
i.e. the modulo-2 sum of the bitwise AND of strings s and x. For example
$1101 \cdot 0111 = 0$, in accord with the fact that there are an even number of
ones in the subset consisting of the first, second and fourth bit of the string
0111. Although the inner product $s \cdot x$ is a symmetric function of its two
arguments, we use a slanted font for the first argument to emphasize its role
as a subset selection index, while the second argument (in Roman font) is
the bit string representing an unknown sequence of Bell states to be purified.
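The Boolean inner product just defined can be illustrated with a minimal Python sketch. The function name `subset_parity` and the choice of representing bit strings as Python strings of '0' and '1' are assumptions made only for this illustration.

```python
def subset_parity(s: str, x: str) -> int:
    """Boolean inner product s . x: the modulo-2 sum of the bitwise AND.

    s selects a subset of bit positions; x is the bit string whose
    selected bits are summed modulo 2.
    """
    if len(s) != len(x):
        raise ValueError("s and x must have the same length")
    return sum(int(a) & int(b) for a, b in zip(s, x)) % 2


# The example from the text: the first, second and fourth bits of 0111
# contain an even number of ones, so the parity is 0.
print(subset_parity("1101", "0111"))
```

Because the inner product is symmetric, swapping the two arguments gives the same value; the distinction between index string and data string is only one of interpretation.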

The hashing protocol takes advantage of the following facts: 

• The distribution $P_{X_0}$ of initial sequences $x_0$, being a product of n
identical independent distributions, receives almost all its weight from a set
of $\approx 2^{nS(W)}$ "likely" strings. If the likely set $\mathcal{L}$ is defined as
comprising the $2^{n(S(W)+\delta)}$ most probable strings in $P_{X_0}$, then the
probability that the initial string $x_0$ falls outside $\mathcal{L}$ is
$O(\exp(-\delta^2 n))$.

• As will be described in more detail later, the local Bell-preserving unitary
operations of Table 1 (bilateral $\pi/2$ rotations, unilateral Pauli
rotations, and BXORs), followed by local measurement of one of the
pairs, can be used to learn the parity of an arbitrary subset s of the bits
in the unknown Bell-state sequence x, leaving the remaining unmeasured
pairs in definite Bell states characterized by a two-bits-shorter
string $f_s(x)$ determined by the initial sequence x and the chosen subset
s.

• For any two distinct strings $x \neq y$, the probability that they agree on
the parity of a random subset of their bit positions, i.e., that
$s \cdot x = s \cdot y$ for random s, is exactly 1/2. This is an elementary
consequence of the distributive law
$(s \cdot x) \oplus (s \cdot y) = s \cdot (x \oplus y)$.
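The third fact can be checked by brute force for short strings. The sketch below enumerates every subset index string s and counts how often two given strings agree on the subset parity; the helper names `parity` and `agreement_fraction` are illustrative assumptions, and bit strings are represented as tuples of 0/1.

```python
from itertools import product


def parity(s, x):
    """Boolean inner product s . x of two equal-length 0/1 tuples."""
    return sum(a & b for a, b in zip(s, x)) % 2


def agreement_fraction(x, y):
    """Fraction of all subset index strings s with s . x == s . y."""
    n = len(x)
    subsets = list(product((0, 1), repeat=n))
    agree = sum(parity(s, x) == parity(s, y) for s in subsets)
    return agree / len(subsets)


x = (0, 1, 1, 1)
y = (1, 1, 0, 1)
# x and y differ, so s . (x XOR y) is 0 for exactly half of all s.
print(agreement_fraction(x, y))
```

The count is exact rather than approximate: for a nonzero difference string, flipping s at any position where x and y differ toggles the parity of the difference, pairing each agreeing s with a disagreeing one.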

The hashing protocol consists of $n-m$ rounds of the following procedure.
At the beginning of the (k+1)'st round, k = 0, 1, ..., n-m-1, Alice and
Bob have $n-k$ impure pairs whose unknown Bell state is described by a
$2(n-k)$-bit string $x_k$. In particular, before the first round, the Bell sequence
$x_0$ is distributed according to the simple a priori probability distribution
$P_{X_0}$ noted above. Then in the (k+1)'st round, Alice first chooses and tells
Bob a random $2(n-k)$-bit string $s_k$. Second, Alice and Bob perform local
unitary operations and measure one pair to determine the subset parity
$s_k \cdot x_k$, leaving behind $n-k-1$ unmeasured pairs in a Bell state described
by the $(2(n-k)-2)$-bit string $x_{k+1} = f_{s_k}(x_k)$.

Consider the trajectories of two arbitrary but distinct strings $x_0 \neq y_0$
under this procedure. Let $x_k$ and $y_k$ denote the images of $x_0$ and $y_0$
respectively after k rounds, where the same sequence of operations
$f_{s_0}, f_{s_1}, \ldots, f_{s_{n-m-1}}$, parameterized by the same random-subset
index strings $s_0, s_1, \ldots, s_{n-m-1}$, is used for both trajectories. It can
readily be verified that for any r < n the probability

$$ P\bigl( (x_r \neq y_r) \;\&\; \forall_{k=0}^{r-1}\, (s_k \cdot x_k = s_k \cdot y_k) \bigr) \qquad (46) $$

(i.e., the probability that $x_r$ and $y_r$ remain distinct while nevertheless having
agreed on all r subset parities along the way, $s_k \cdot x_k = s_k \cdot y_k$ for
k = 0, ..., r-1) is at most $2^{-r}$. This follows from the fact that at each
iteration the probability that x and y remain distinct is $\le 1$, while the
probability that, if they were distinct at the beginning of the iteration, they
will give the same subset parity is exactly 1/2. Recalling that the likely set
$\mathcal{L}$ of initial candidates has only $2^{n(S(W)+\delta)}$ members, but with
probability greater than $1 - O(\exp(-\delta^2 n))$ includes the true initial
sequence $x_0$, it is evident that after $r = n-m$ rounds, the probability of
failure, i.e. of no candidate, or of more than one candidate, remaining at the
end for $x_{n-m}$, is at most $2^{n(S(W)+\delta)-(n-m)} + O(\exp(-\delta^2 n))$.
Here the first term upper-bounds the probability of more than one candidate
surviving, while the second term upper-bounds the probability of the true $x_0$
having fallen outside the likely set. Letting $n-m = n(S(W) + 2\delta)$ and taking
$\delta \approx n^{-1/4}$, we get the desired result, that the error probability
approaches 0 and the yield m approaches $n(1 - S(W))$ in the limit of large n.
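The scaling in this argument can be made concrete with a small numerical sketch. It assumes the choices stated above, $n-m = n(S(W)+2\delta)$ and $\delta = n^{-1/4}$, ignores rounding of $n-m$ to an integer, and evaluates only the first (collision) term of the failure bound; the function name `hashing_parameters` is an assumption for illustration.

```python
def hashing_parameters(n: int, entropy: float):
    """Return (yield m/n, log2 of the collision term) for block size n.

    entropy is S(W) in bits per pair.  With n - m = n(S + 2*delta) and
    delta = n**(-1/4), the collision term 2**(n(S+delta) - (n-m))
    simplifies to 2**(-n*delta) = 2**(-n**0.75).
    """
    delta = n ** -0.25
    sacrificed = n * (entropy + 2 * delta)            # n - m pairs measured
    yield_per_pair = (n - sacrificed) / n             # m / n = 1 - S - 2*delta
    log2_collision = n * (entropy + delta) - sacrificed
    return yield_per_pair, log2_collision


for n in (10**4, 10**6, 10**8):
    y, log2_term = hashing_parameters(n, 0.5)
    print(n, round(y, 4), round(log2_term, 1))
```

As n grows, the yield per pair climbs toward $1 - S(W)$ while the exponent of the collision term becomes more and more negative, which is the content of the limit claimed above.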

It remains to show how the local operations of Table 1 can be used to
collect the parity of an arbitrary subset of bits of x into the amplitude bit
of a single pair. We choose as the destination pair, into which we wish to
collect the parity $s \cdot x$, that pair corresponding to the first nonzero bit of s.
For example if s = 00,11,01,10 (see Fig. 10), the destination will be the
second pair of $x_k$. Our goal will be to make the amplitude bit of that pair
after round k equal to the parity of: both bits of the second pair, the right
bit of the third pair, and the left bit of the fourth pair in the unknown input
$x_k$. Pairs such as the first, having 00 in the index string s, have no effect on
the desired subset parity, and accordingly are bypassed by all the operations
described below.

The first step in collecting the parity is to operate separately on each of
the pairs having a 01, 10, or 11 in the index string, so as to collect the desired
parity for that pair into the amplitude (right) bit of the pair. This can be
achieved by doing nothing to pairs having 01 in the index string, performing
a $B_y$ on pairs having 10 (since $B_y$ has the effect of interchanging the phase
and amplitude bits of a Bell state), and performing the two rotations $B_x$
and $\sigma_x$ on pairs with 11 in the index string ($B_x \sigma_x = \sigma_x B_x$ has the
effect of XORing a Bell state's phase bit into its amplitude bit).

The next step consists of BXORing all the pairs except those with 00 in
the index string into the selected destination, in this case the second pair.
The selected destination pair is used as the common target for all these
BXORs, causing its amplitude bit to accumulate the desired subset parity
$s \cdot x$. This follows from the fact (cf. Table 1) that the BXOR leaves the
source's amplitude bit unaffected while causing the target's amplitude bit
to become the XOR of the previous amplitude bits of source and target.
Recall that phase bits behave oppositely under BXOR: the target's phase bit
is unaffected while the source's phase bit becomes the XOR of the previous
values of source and target phase bits; this "back-action" must be accounted
for in determining the function $f_s$. Figure 10 illustrates this step of the
hashing method on an unknown 4-Bell-state sequence x using the subset
index string s = 00,11,01,10 mentioned before.
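The two steps just described can be traced at the level of classical bit bookkeeping. The following sketch is an illustration under stated assumptions, not the authors' implementation: each Bell pair is stored as [phase, amplitude] (left bit, right bit), the effect of $B_x \sigma_x$ on the phase bit is assumed to be trivial (the text specifies only its effect on the amplitude bit), and the names `collect_parity` and `subset_parity` are hypothetical.

```python
from itertools import product


def subset_parity(s_pairs, x_pairs):
    """Boolean inner product s . x over a list of two-bit pairs."""
    return sum(sb & xb for s, x in zip(s_pairs, x_pairs)
               for sb, xb in zip(s, x)) % 2


def collect_parity(s_pairs, x_pairs):
    """Collect s . x into the amplitude bit of the destination pair.

    Returns (amplitude bit of destination, all pairs after the operations).
    """
    pairs = [list(p) for p in x_pairs]               # [phase, amplitude]
    active = [i for i, idx in enumerate(s_pairs) if idx != (0, 0)]
    if not active:
        raise ValueError("s must select at least one bit")
    # Step 1: per-pair operations move the wanted parity into the amplitude bit.
    for i in active:
        phase, amp = pairs[i]
        if s_pairs[i] == (1, 0):                     # B_y: swap phase and amplitude
            pairs[i] = [amp, phase]
        elif s_pairs[i] == (1, 1):                   # B_x sigma_x: amp ^= phase
            pairs[i] = [phase, amp ^ phase]          # phase unchanged: assumption
        # index 01: nothing to do
    # Step 2: BXOR every other active pair (source) into the destination (target).
    dest = active[0]
    for src in active[1:]:
        pairs[dest][1] ^= pairs[src][1]              # target amplitude accumulates
        pairs[src][0] ^= pairs[dest][0]              # back-action on source phase
    return pairs[dest][1], pairs


s = [(0, 0), (1, 1), (0, 1), (1, 0)]
x = [(1, 0), (0, 1), (1, 1), (1, 0)]
print(collect_parity(s, x)[0], subset_parity(s, x))
```

The second return value shows the back-action on the phase bits of the source pairs, which is why the post-measurement string depends on both x and s.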

The hashing protocol distills a yield $D_H = 1 - S(W)$, which we have
called D in our previous work [17]. For the Werner channel, parameterized
completely by F,

$$ S(W_F) = -F \log_2 F - (1-F) \log_2\bigl((1-F)/3\bigr), \qquad (47) $$

giving a positive yield for Werner states with F > 0.8107. Figures 8 and 9
show $D_H(F)$, comparing it with E and with other purification protocols.
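The threshold quoted above can be reproduced numerically from Eq. (47). The sketch below evaluates the Werner-state entropy and locates the fidelity at which the hashing yield $1 - S(W_F)$ crosses zero by bisection; the function names and the bracketing interval [0.5, 0.999999] are assumptions chosen for this illustration.

```python
import math


def werner_entropy(F: float) -> float:
    """S(W_F) of Eq. (47), in bits, for 0 < F < 1."""
    return -F * math.log2(F) - (1 - F) * math.log2((1 - F) / 3)


def hashing_yield(F: float) -> float:
    """Hashing yield D_H(F) = 1 - S(W_F); negative values mean no yield."""
    return 1 - werner_entropy(F)


def yield_threshold(lo: float = 0.5, hi: float = 0.999999,
                    tol: float = 1e-9) -> float:
    """Bisection for the F at which D_H changes sign (D_H increases on [lo, hi])."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hashing_yield(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


print(round(yield_threshold(), 4))
```

Bisection is adequate here because the entropy decreases monotonically in F over the bracketing interval, so the yield has a single sign change.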

Figure 10: Step k of the one-way hashing protocol, used to determine
the parity $s_k \cdot x_k$, for an arbitrary unknown set of four Bell states
represented by an unknown 8-bit string x relative to a known subset index string
s = 00,11,01,10. If bilateral measurement $\mathcal{M}$ yields a $\Psi$ state (i.e. if the
measurement result is 1), then half the candidates for x are excluded (e.g.
x = 00,00,00,00), but half are still allowed (e.g. x = 00,10,00,00). For each
allowed x, the after-measurement Bell states of the three remaining unmeasured
pairs are described by a 6-bit sequence $x_{k+1} = f_s(x_k)$ deterministically
computable from x and s.

3.2.4 Breeding method



This protocol, introduced in Ref. [17], will not be described here in detail,
as it has been superseded by the one-way hashing protocol described in the
preceding section. The breeding protocol assumes that Alice and Bob have
a shared pool of pure $|\Phi^+\rangle = 00$ states, previously prepared by some other
method (e.g. the recurrence method) and also a supply of Bell-diagonal
impure states which they wish to purify. The protocol consumes the $\Phi^+$
states from the pool, but, if the impure states are not too impure, produces
more newly purified pairs than the number of pool states consumed (in the
manner of a breeder reactor).

The basic step of breeding is very similar to that of hashing and is shown
in Fig. 11. Again a random subset s of the amplitude and phase bits of the
Bell states is selected. The parity of this selected set is again gathered up
in exactly the same way, except that the target of the BXOR operations is
one of the pre-purified 00 states. The use of the pure target simplifies the
action of the BXOR, in that the "back action" which changes the state of the
source bits is avoided in this scheme. This means that the input string x can
be restored to exactly its original value by a simple undoing of the one-qubit
local operations, as shown. This offers the advantage that the (possibly very
complicated) sequence of boolean functions $f_{s_0}, f_{s_1}, \ldots, f_{s_{n-m-1}}$ does not
have to be calculated in this case. Once again, the result of the parity
measurement $\mathcal{M}$ is to reduce the number of candidates for x by almost exactly
1/2. Thus, by the same argument as before, after $n-m \approx nS(W)$ rounds of parity
measurements, it is probable that x has been narrowed down to be just one
member of the likely set $\mathcal{L}$. Thus, all n of these pairs can be turned into pure
$\Phi^+$ states; however, since $n-m$ pure $\Phi^+$'s have been used up in the process,
the net yield is $m/n = D_H(F)$, exactly the same as in the hashing protocol.



4 One-way D and two-way D are provably 
different 

It has already been noted that some of the entanglement purification schemes
use two-way communication between the two parties Alice and Bob while
others use only one-way communication. The difference is significant because
one-way protocols can be used to protect quantum states during storage in a
noisy environment, as well as during transmission through a noisy channel,
while two-way protocols can only be used for the latter purpose (cf. Section
|^). Thus it is important to know whether there are mixed states for which
$D_1$ is properly less than $D_2$. Here we show that there are, and indeed that
the original Werner state $W_{5/8}$ (i.e., the result of sharing singlets through
a 50% depolarizing channel) cannot be purified at all by one-way protocols,
even though it has a positive yield under two-way protocols.

Figure 11: Step k of the one-way breeding protocol. The scheme is very
similar to the hashing protocol of Fig. 10, except that the target for the
BXORs is guaranteed to be a perfect $\Phi^+$ state. This allows the one-bit
operations to be undone so that there is no back-action on the string x.

To show this, consider an ensemble where a state-preparer gives Alice n
singlets, half shared with Bob and half shared with another person (Charlie).
Alice is unaware of which pairs are shared with Bob and which with Charlie.
Bob and Charlie are also given enough extra garbage particles (either randomly
selected qubits or any state totally entangled with the environment
but with no one else) so that they each have a total of n particles as well.
This situation is diagrammed in Fig. 12. From Alice and Bob's point of view,
each state has the density matrix $W_{5/8}$: with probability 1/2 the pair is a
singlet, and with probability 1/2 Bob's particle is garbage and the pair is
totally mixed, giving fidelity F = 1/2 + (1/2)(1/4) = 5/8.

Figure 12: A symmetric situation in which Bob and Charlie are each equally
entangled with Alice. Two-headed arrows denote maximally-entangled pairs,
and open circles denote garbage states (Eq. (18)).

Alice, without hearing any information from Bob or Charlie, is supposed
to do her half of a purification protocol and then send on classical data to the
others. Therefore, each particle Alice has looks like a totally mixed state to
her. By symmetry, anything she could do to assure herself that a particular
particle is half of a good EPR pair shared with Bob will also assure her
that the same particle is half of a good EPR pair shared with Charlie. No
such three-sided EPR pair can exist. If she used it to teleport a qubit to
Bob she would also have teleported it to Charlie, violating the no-cloning
theorem [39]. Therefore, she cannot distill even one good EPR pair from
an arbitrarily large supply of $W_{5/8}$ states. On the other hand the combined
recurrence-hashing method ($D_M$ in Fig. 8) gives a positive lower bound on
the two-way yield, $D_2(W_{5/8}) > 0.00457$, so we can write

$$ D_1(W_{5/8}) = 0 < 0.00457 < D_2(W_{5/8}). \qquad (48) $$

It is also clear that any ensemble of Werner states can be reduced to one
of lower fidelity by local action (combining with totally mixed states of
Eq. (18)). Therefore $D_1(W_F) = 0$ for all F < 5/8. Knill and Laflamme
prove IIJ that $D_1(W_F) = 0$ for all F < 3/4. In Sec. |6J| we explain their
proof and, using the argument of Sec. [Oj obtain the bound

$$ D_1 \le 4F - 3, \qquad (49) $$

as shown in Figs. 8 and 9.

A similar argument can be used to show that for some ensembles $D_1$
is not symmetric depending on whether it is Alice or Bob who starts the
communication. Suppose in the symmetric situation of Fig. 12 that Bob
and Charlie know which pairs are shared with Alice and which are garbage.
For this ensemble the symmetry argument for Alice remains the same and
$D_{A \to B} = 0$. If the communication is from Bob to Alice, though, it is easy to
see he can use half of his particles, the ones he knows are good pairs shared
with Alice. The other half are useless since they have E = 0 and could have
been manufactured locally. Thus we have $D_{B \to A} = 1/2$ and $D_{A \to B} = 0$.

Our no-cloning argument shows that Alice and Bob cannot generate good
EPR pairs by applying a 1-EPP to the mixed state $W_{5/8}$ generated by sharing
singlets through a 50% depolarizing channel. As a consequence, there is no
quantum error-correcting code which can transmit unknown quantum states
reliably through a 50% depolarizing channel, as will be shown in the next
section.



5 Noisy Channels and Bipartite Mixed States 

In preceding sections we have considered the preparation and purification 
of bipartite mixed states, and we have shown that two-way entanglement 
purification protocols can purify some mixed states that cannot be puri- 
fied by any one-way protocol. When used in conjunction with teleporta- 
tion, purification protocols, whether one-way or two-way, offer a means of 
transmitting quantum information faithfully via noisy channels; and one- 
way protocols, by producing time-separated entanglement, can addition- 
ally be used to protect quantum states during storage in a noisy environ- 
ment. In this section we discuss the close relation between one-way entan- 
glement purification protocols and the other well-known means of protect- 
ing quantum information from noise, namely quantum error- correcting codes 



(QECC) [| |, 0, 0, [IJ, |IJ, 

A quantum channel $\chi$, operating on states in an N-dimensional Hilbert
space, may be defined as (cf. 0) a unitary interaction of the input state
with an environment, in which the environment is supplied in a standard pure
initial state $|0\rangle$ and is traced out (i.e. discarded) after the interaction to yield
the channel output, generally a mixed state. The quantum capacity $Q(\chi)$ of
such a channel is the maximum asymptotic rate of reliable transmission of
unknown quantum states $|\xi\rangle$ in $\mathcal{H}_2$ through the channel that can be achieved
by using a QECC to encode the states before transmission and decode them
afterward.

As in quantum teleportation H we will also consider the possibility that
the quantum channel is supplemented with classical communication. This
leads us to define the augmented quantum capacities $Q_1(\chi)$ and $Q_2(\chi)$ of a
channel supplemented by unlimited one- and two-way classical communication.
For example, Fig. 13 shows a quantum error-correcting code, consisting
of encoding transformation $U_e$ and decoding transformation $U_d$, used to
transmit unknown quantum states $|\xi\rangle$ reliably through the noisy quantum
channel $\chi$, with the help of a one-way classical side channel (operating in the
same direction as the quantum channel). Perhaps surprisingly, this one-way
classical channel provides no enhancement of quantum capacity:

$$ Q_1 = Q. \qquad (50) $$

This will be shown in Sec. 5.1.




Figure 13: A general one-way QECC. A classical side-channel from Alice to
Bob is allowed in addition to quantum channel $\chi$.

We consider also the case of a noisy quantum channel supplemented by a
noiseless quantum channel. We will show in Sec. 5.2 that the capacity of n
uses of a noisy channel supplemented by m uses of a noiseless channel of unit
capacity is no greater than the sum of their individual capacities, i.e. their
quantum capacities are no more than additive. We have no similar result for
the case of two different imperfect channels.

In contrast to Eq. (50), we will show that for many quantum channels
two-way classical communication can be used to transmit quantum states
through the channel at a rate $Q_2(\chi)$ considerably exceeding the one-way
capacity $Q(\chi)$. This is typically done by using the channel to share EPR
pairs between Alice and Bob, purifying the resulting bipartite mixed states
by a two-way entanglement purification protocol, then using the resulting
purified pairs to teleport unknown quantum states $|\xi\rangle$ from Alice to Bob.

The analysis of Q and $Q_2$ is considerably simplified by the fact that
an important class of noisy channels, including depolarizing channels, can
be mapped in a one-to-one fashion onto a corresponding class of bipartite
mixed states, with the consequence that the channel's quantum capacity
$Q_1 = Q$ is given by the one-way distillable entanglement $D_1$ of the mixed
state, and vice versa. For example, a depolarizing channel of depolarization
probability p = 1 - x (cf. Eq. (18)) corresponds to a Werner state $W_F$ of
fidelity F = 1 - (3p/4) and has $Q = D_1(W_F)$ and $Q_2 = D_2(W_F)$.

The correspondence between channels and mixed states is established by
two functions, $M(\chi)$ defining the bipartite mixed state obtained from channel
$\chi$ and $\chi(M)$ defining the channel obtained from bipartite mixed state M. The
bipartite mixed state $M(\chi)$ is obtained by preparing a standard maximally
entangled state of two N-state subsystems,

$$ \Upsilon = N^{-1/2} \sum_{j=1}^{N} |e_j\rangle \otimes |e_j\rangle \qquad (51) $$

and transmitting Bob's part through the channel $\chi$. For example a Werner
state $W_F$, with F = 1 - 3p/4, results when half a standard EPR pair is
transmitted through a p-depolarizing channel.
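The relation F = 1 - 3p/4 can be tabulated with a short sketch using exact rational arithmetic. It relies on the standard form of a Werner state, with weight F on one Bell state and (1 - F)/3 on each of the other three, the same weights that enter Eq. (47); the function names are assumptions for illustration.

```python
from fractions import Fraction


def werner_fidelity(p) -> Fraction:
    """Fidelity F = 1 - 3p/4 of the Werner state produced by sending half
    of a standard EPR pair through a p-depolarizing channel."""
    p = Fraction(p)
    if not 0 <= p <= 1:
        raise ValueError("depolarization probability must lie in [0, 1]")
    return 1 - Fraction(3, 4) * p


def bell_weights(p):
    """Bell-diagonal weights of W_F: F on the transmitted Bell state and
    (1 - F)/3 on each of the other three."""
    F = werner_fidelity(p)
    other = (1 - F) / 3
    return [F, other, other, other]


# A 50% depolarizing channel gives the Werner state of fidelity 5/8.
print(werner_fidelity(Fraction(1, 2)))
print(bell_weights(Fraction(1, 2)))
```

At p = 1 the four weights are all 1/4, i.e. the totally mixed state, which matches the picture of a fully depolarizing channel discarding its input.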

The mapping in the other direction, from mixed states to channels, is
obtained by teleportation. Given a bipartite mixed state M of two subsystems,
each having Hilbert space of dimension N, the channel $\chi(M)$ is defined
by using mixed state M, instead of the standard maximally entangled state
$|\Upsilon\rangle\langle\Upsilon|$, in a teleportation || channel (see Fig. f|). It can be readily shown
that for Bell-diagonal mixed states the two mappings are mutually inverse,
$M(\chi(M)) = M$; we shall call the channels corresponding to such mixed
states "generalized depolarizing channels".

For more general channels and mixed states, the two mappings are not
generally mutually inverse. For example, $\chi(M)$, for the bipartite state
$M = |{\uparrow\uparrow}\rangle\langle{\uparrow\uparrow}|$, is the p = 1 depolarizing channel, and
$M(\chi(M)) = G$ of Eq. (18).

Nevertheless, two quite general inequalities will be demonstrated in
Sections 5.3 and 5.4:

$$ \forall M \quad D_1(M) \ge Q(\chi(M)) \qquad (52) $$

and

$$ \forall \chi \quad D_1(M(\chi)) \le Q(\chi). \qquad (53) $$

If (as in the case of a Bell diagonal state and its corresponding generalized
depolarizing channel) the mapping is reversible, so that $M = M(\chi)$ and
$\chi = \chi(M)$, the two inequalities are both satisfied, resulting in the equality
mentioned earlier, viz.

$$ D_1(M) = Q(\chi). \qquad (54) $$

Equation (52) follows from the ability, to be demonstrated in Sec. 5.3, to
transform a QECC on $\chi(M)$ into a 1-EPP on M; Eq. (53) follows, as shown
in Sec. 5.4, from the fact that any 1-EPP on $M(\chi)$, followed by quantum
teleportation, results in a QECC on $\chi$ with a classical side channel.

A trivial extension of these arguments also shows that the corresponding
results for two-way classical communication are true, namely:

$$ \forall M \quad D_2(M) \ge Q_2(\chi(M)) \qquad (55) $$

and

$$ \forall \chi \quad D_2(M(\chi)) \le Q_2(\chi), \qquad (56) $$

and if $M(\chi(M)) = M$ then

$$ D_2(M) = Q_2(\chi). \qquad (57) $$

5.1 A forward classical side channel does not increase 
quantum capacity 

To demonstrate Eq. (50), we note that any one-way protocol for transmitting
$|\xi\rangle$ through channel $\chi$ can be described as in Fig. 13. The sender Alice
codes $|\xi\rangle$ and an ancillary state $|0\rangle$ using unitary transformation $U_e$. She
then performs an incomplete measurement on the coded system giving classical
results r which she sends on to Bob, the receiver. (If r contains any
information about the quantum input $|\xi\rangle$ the strong no-cloning theorem JIT
would prevent the original state from being recovered perfectly, even if the
channel were noiseless. However, r might contain information on how the
input $|\xi\rangle$ is coded.) She also sends the remaining quantum state through $\chi$
as encoded state $|\zeta_r\rangle$. The channel maps $|\zeta_r\rangle$ onto $|\eta_{ri}\rangle$ for a noise
syndrome i.

Consider the unitary transformation Bob uses for decoding in the case of
some value of the classical data r for which the decoding is successful, and
without loss of generality name this case r = 0. (For a code which corrects
with asymptotically perfect fidelity there may be some cases of r for which
the correction doesn't work.) We also consider an error syndrome i which is
successfully corrected by $U_d$. We have

$$ U_d(r=0)\,(|\eta_{0i}\rangle \otimes |0\rangle) = |\xi\rangle \otimes |a_i\rangle. \qquad (58) $$

(For our choice of i the final $|a_i\rangle$ state can without loss of generality be taken
to be $|0\rangle$ in an appropriately sized Hilbert space.) Applying $U_d^{-1}(r=0)$ gives

$$ U_d^{-1}(r=0)\,(|\xi\rangle \otimes |0\rangle) = |\eta_{0i}\rangle \otimes |0\rangle. \qquad (59) $$

There must exist another unitary operation $U_s$ which rotates $|\eta_{0i}\rangle$ into the
noiseless coded vector $|\zeta_0\rangle$. Thus,

$$ U_s U_d^{-1}(r=0)\,(|\xi\rangle \otimes |0\rangle) = |\zeta_0\rangle \otimes |0\rangle. \qquad (60) $$

In other words, $U_s U_d^{-1}(r=0)$ takes $|\xi\rangle$ into $|\zeta_0\rangle$ along with some ancillary
inputs and outputs always in a standard $|0\rangle$ state. Therefore $U_s U_d^{-1}(r=0)$ is
a good encoder. Since this encoder always results in the correct code vector
corresponding to classical data r = 0, this data need not be sent to Bob at
all, as he will have anticipated it. Thus, $U_s U_d^{-1}(r=0)$ and $U_d$ form a code
needing no classical side-channel.

It may happen that for a large block code which only error-corrects to
some high fidelity (greater than $1 - \epsilon$ for the final output of the
decoder) no case is corrected perfectly. Then the coded states produced
by $U_s U_d^{-1}(r=0)$ will be imperfect. After transmission through the noisy
channel and correction by $U_d$ the final output will then be less perfect than
in the original code. Nevertheless, because of unitarity it is clear that as
$\epsilon \to 0$ the fidelity of this code will also approach unity.

Thus any protocol using classical one-way data transmission to supplement
a quantum channel can be converted into a protocol in which the classical
transmission is unnecessary and with the same capacity $Q = Q_1$. We
have also now shown that the encoding stage is unitary, in the sense that no
extra classical or quantum results accumulate in Alice's lab.

If the error syndrome i = 0, corresponding to no error, is decoded with
high fidelity by $U_d$ then $U_s$ can be taken to be the identity. Thus, the encoding
and decoding transformations can in this case be written in a form where
$U_e = U_d^{-1}$, a fact independently shown by Knill and Laflamme [40]. If the
i = 0 error syndrome is not decoded with high fidelity by $U_d$ [42] then the
encoder cannot be the inverse of the decoder. The proof is simple:
$U_e(|\xi\rangle \otimes |0\rangle) = |\zeta\rangle$ (where we have dropped the r subscripts since it
has been proven the classical data is never needed) and therefore
$U_e^{-1}|\zeta\rangle = |\xi\rangle \otimes |0\rangle$. Thus $U_e^{-1}$ decodes the noiseless coded vectors
$|\zeta\rangle$, which is exactly what $U_d$ has been assumed not to do.

5.2 Additivity of perfect and imperfect quantum channel
capacities

Consider a channel of capacity Q > 0 supplemented by a perfect channel of
capacity 1. Suppose the imperfect channel is used n times and the perfect
channel is used m times. We will call the maximum number of bits transmitted
through the channels in this case T. If the capacity of this joint channel
is additive then $T = T_a = Qn + m$.

Suppose the number of bits transmitted is superadditive, i.e. $T > T_a$.
From the definition of noisy channel capacity we know that we can use an
imperfect channel t times to simulate a perfect channel being used m times
where Qt = m. We now use the imperfect channel a total of n + t times and we
can transmit T qubits through this two-part use of the imperfect channel.
But $T > T_a = Qn + m$, so

$$ T > Qn + Qt. \qquad (61) $$

The capacity of this channel is $Q' = T/(n+t)$. Using Eq. (61) we can write

$$ Q' = \frac{T}{n+t} > \frac{Qn + Qt}{n+t} = Q. \qquad (62) $$

A capacity of Q' > Q has been achieved using only the original imperfect
channel whose capacity was Q. This cannot be so.

5.3 QECC -> 1-EPP proving $\forall M\; D_1(M) \ge Q(\chi(M))$



To demonstrate this inequality (cf. Fig. 14) we use bipartite mixed states M
in place of the standard maximally entangled states ($\Phi^+$) to teleport n qubits
from Alice to Bob. This teleportation defines a certain noisy channel $\chi(M)$,
so designated on the center right of the figure. Alice prepares n qubits to be
teleported through this channel by applying the encoding transformation $U_e$
of a QECC to m halves of EPR pairs which she generates in her lab (upper
left) at I and to $n-m$ ancillas in the standard $|0\rangle$ state. The resulting
quantum-encoded n qubits are teleported to Bob at lower right through the
noisy channel. There Bob applies the decoding transformation $U_d$. If the
code can successfully correct the errors introduced by the noisy teleportation,
then the result is that Alice and Bob share m time-separated EPR pairs (*).
Indeed the whole figure can be regarded as a one-way purification protocol
whereby Alice and Bob prepare m good EPR pairs from n of the initial mixed
states M, using a QECC of rate Q = m/n able to correct errors in the noisy
quantum channel $\chi(M)$. Thus $D_1(M)$ must be at least as great as the rate
$Q(\chi(M))$ of the best QECC able to achieve reliable quantum transmission
through $\chi(M)$.




Figure 14: A QECC can be transformed into a 1-EPP. Teleporting ($\mathcal{M}_4$, $U_4$)
via a mixed state M defines the noisy channel $\chi(M)$. If a quantum
error-correcting code {$U_e$, $U_d$} can correct the errors in this channel, the code and
channel can be used to share pure entanglement between Alice and Bob (*).
This establishes inequality (52), viz. $\forall M\; D_1(M) \ge Q(\chi(M))$.

5.4 1-EPP -> QECC proving $\forall \chi\; D_1(M(\chi)) \le Q(\chi)$

In the same style as the last section, we establish the second inequality by
exhibiting an explicit protocol. The object is to show that, given the existence
of a 1-EPP acting on the mixed state $M(\chi)$ obtained from quantum
channel $\chi$, Alice can successfully transmit arbitrary quantum states $|\xi\rangle$ to
Bob. The capacity Q of this quantum channel is the same as $D_1$ for the
1-EPP; this establishes that the capacity of $\chi$ is at least as good as the $D_1$
of the corresponding 1-EPP.




Figure 15: A 1-EPP can be transformed into a QECC. Given $\chi$, Alice creates
mixed states $M(\chi)$ by passing halves of entangled states $\Phi^+$ from source
I through the channel. Alice and Bob perform a 1-EPP resulting in perfectly
entangled states (*) which are then used to teleport $|\xi\rangle$ safely to Bob,
completing a QECC.

In fact, this protocol just involves the application of quantum teleportation
H mentioned in the introduction. In Fig. 15 we show more explicitly
the necessary construction, which has already been touched on in Figs. |3]
and |J Alice and Bob are connected by channel $\chi$. Alice arranges to share
the bipartite mixed state $M(\chi)$ with Bob by passing halves (the B particles)
of maximally entangled states ($\Phi^+$) from source I through $\chi$ to Bob. Then
Alice and Bob partake in the 1-EPP protocol. We have represented this
procedure somewhat more generally than is necessary for the hashing-type
procedures shown earlier, or for the finite-block protocols to be derived below.
We simply indicate that they must perform two operations $U_a$ and $U_b$,
and that Alice will perform some measurements $\mathcal{M}$ and pass the results to
Bob. The measurements which Bob would perform in the hashing protocol
are understood to be incorporated in $U_b$. Also, we have accounted for the
possibility that either Alice or Bob might employ an ancilla a for some of
their processing operations.

By hypothesis, this protocol leaves Alice and Bob with $nD_1$ maximally
entangled states (*). They then may use this resource to teleport $nD_1$ unknown
quantum bits in the state $|\xi\rangle$. Thus, the net effect is that Alice and Bob, using
channel $\chi$ supplemented by one-way classical communication, have a means
of reliably transmitting quantum data, with capacity $D_1(M(\chi))$. This is
exactly a QECC on $\chi$ with a one-way classical side-channel. However Eq. (50)
(proven in Sec. 5.1) states that the same capacity can be obtained without
the use of classical communication. Thus, the ultimate capacity Q of channel
$\chi$ must be at least as great. This establishes the inequality.



6 Simple quantum error-correcting codes 



For most of the remainder of this paper, we will exploit the equivalence which
we have established between a 1-EPP on $M(\chi)$ and a QECC on $\chi$.

We note that when the 1-EPP has the property that the unitary
transformations $U_b$ and $U_4$ performed by Bob can be done "in place" (i.e. no
ancilla qubits need to be introduced, see Fig. ), the 1-EPP can be
transformed into a particularly simple style of QECC, exactly like the schemes
which have been introduced by Shor || and have now been extended by
many others [10], [11], [12], [13], [14], [15], [16], which are also all done "in place."
As we have seen in Figs. 14 and 15, some versions of 1-EPP and QECC may
require ancilla a for their implementation.

The proof of the correspondence between the in-place 1-EPP and in-place
QECC is immediate, following Sec. 5.4. The 1-EPP is used to make a QECC
as in Fig. 15. The unitary transformations $U_b$ and $U_4$ performed by Bob
are combined as a $U_d$, and $U_d$ is performed in place by assumption. Thus
the encoder $U_e$ (see Sec. 5.1) can also be done in place.
As a simple consequence of this result, the one-way hashing protocol of
Sec. 3.2.3 can be reinterpreted as an explicit error correction code, and indeed
it does the same kind of job as the recent quantum error correction schemes
based on linear-code theory of Calderbank and Shor [10] and Steane [11]:
in the limit of large qubit block size n, it protects an arbitrary state in a
$2^m$-dimensional Hilbert space from noise. We note that the hashing protocol
actually does somewhat better than the linear-code schemes. $D_1(M(\chi))$, and
therefore $Q(\chi)$ (see Eq. (54)), is higher for hashing than for the linear-code
scheme, as shown in Figs. 8 and 9.

We will make further contact with this other work on error-correction 
coding in finite blocks by showing how finite blocks of EPR pairs can be 
purified in the presence of noise which only affects a finite number of the 
Bell states. When transformed into an error correcting code, this becomes 
a procedure for recovering from a finite number of qubit errors, as in Shor's 
procedure in which one qubit, coded into nine qubits, is safe from any error 
on a single qubit. We develop efficient numerical strategies based on the 
Bell-state approach which look for new coding schemes of this type, and in 
fact we find a code which does the same job as Shor's using only five EPR 
pairs. 

6.1 Another derivation of a QECC from a restricted 
1-EPP 

Another way to derive the in-place QECC from the in-place 1-EPP is to 
exploit the symmetry between measurement and preparation in quantum 
mechanics. Here we will restrict our attention to noise models which are
one-sided (i.e., $N_A$ absent in Fig. ^), or effectively one-sided. An important
case where the noise is effectively one-sided is when the mixed state $M$
obtained in Fig. 5 is Bell-diagonal, i.e., has the form of $W$ (Eq. (p9|)). We
can say that, subjected to this noise, the pure Bell state is taken to an
ensemble of each of the four Bell states, with some probabilities. Using the
notation of Sec. 3.2.1, these are $p_{00}$, $p_{01}$, $p_{10}$ and $p_{11}$:

$$|\Phi^+\rangle \to \{\sqrt{p_{00}}\,|\Phi^+\rangle,\ \sqrt{p_{01}}\,|\Phi^-\rangle,\ \sqrt{p_{10}}\,|\Psi^+\rangle,\ \sqrt{p_{11}}\,|\Psi^-\rangle\} = \{R_{mn}|\Phi^+\rangle\}. \quad (63)$$

(Here the $R_{mn}$ are proportional to the operators $\{I, \sigma_x, \sigma_y, \sigma_z\}$
of Table 1.) It is easy to show that the same mixed state could be obtained if
the B particles were subjected to a generalized depolarizing channel, and $N_A$
were absent. More generally, we require that $N_{A,B}$ be such that the
resulting $M$ could be obtained from some channel $\chi$: $M = M(\chi)$ for some
$\chi$. This is a fairly obvious restriction to make, since we are planning on
defining a QECC on this effective quantum channel $\chi$. Note also that, since
the twirling of Sec. RTT (item P) converts any bipartite mixed state into a
Werner state, for some purposes any noise can be made effectively one-sided.

We will now show that under these conditions, the operations performed
by Alice in Fig. 15 can be greatly simplified. Consider the joint state of the
A and B particles after Alice has applied the unitary transformation $U_1$ of
Fig. § as part of the purification protocol, but before the one-sided noise
$N_B$ has acted on the B particles. The joint state is still a pure, maximally
entangled state. For convenience, we assume that the source $I$ produces
$\Phi^+$ Bell states. (If it produced another type of Bell state, some
additional simple rotations could be inserted in the derivation we are about to
give.) The initial product of $n$ Bell states may be written

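$$|\Psi\rangle_i = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle_A |x\rangle_B. \quad (64)$$

After the application of the unitary transformation $U_1$ to Alice's particles,
the new state of the system is

$$|\Psi\rangle_f = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} \sum_{y=0}^{2^n-1} (U_1)_{x,y}\, |y\rangle_A |x\rangle_B. \quad (65)$$

But notice that by a simple change of the dummy indices, this state can be
rewritten

$$|\Psi\rangle_f = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} \sum_{y=0}^{2^n-1} |x\rangle_A\, (U_1^T)_{x,y}\, |y\rangle_B. \quad (66)$$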

That is, the unitary transformation applied to the A particles is completely 
equivalent to the same operation (transposed) applied to the B particles. 
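
The equivalence between a unitary on the A particles and its transpose on the
B particles can be checked numerically. The following is a minimal sketch, not
part of the original analysis: the helper names (`bell_product_state`,
`apply_on_A`, `apply_on_B`, `example_unitary`) and the particular test unitary
(a cyclic shift with arbitrary phases) are illustrative assumptions, and the
index convention is the usual one in which an operator acts as a matrix on the
column of amplitudes.

```python
import cmath
import math

def bell_product_state(d):
    """d^{-1/2} * sum_x |x>_A |x>_B, stored as a flat list indexed by a*d + b."""
    psi = [0j] * (d * d)
    for x in range(d):
        psi[x * d + x] = 1 / math.sqrt(d)
    return psi

def apply_on_A(U, psi, d):
    """Apply the d x d matrix U to the A index of the bipartite state."""
    return [sum(U[a][k] * psi[k * d + b] for k in range(d))
            for a in range(d) for b in range(d)]

def apply_on_B(V, psi, d):
    """Apply the d x d matrix V to the B index of the bipartite state."""
    return [sum(V[b][k] * psi[a * d + k] for k in range(d))
            for a in range(d) for b in range(d)]

def transpose(U):
    return [list(row) for row in zip(*U)]

def example_unitary(d):
    """Hypothetical test case: cyclic shift times phases (unitary, not symmetric for d > 2)."""
    U = [[0j] * d for _ in range(d)]
    for k in range(d):
        U[(k + 1) % d][k] = cmath.exp(1j * 0.3 * (k + 1))
    return U

def max_diff(u, v):
    return max(abs(p - q) for p, q in zip(u, v))

d = 4                                   # n = 2 Bell pairs, so d = 2^n
psi = bell_product_state(d)
U = example_unitary(d)
lhs = apply_on_A(U, psi, d)             # U acting on Alice's particles
rhs = apply_on_B(transpose(U), psi, d)  # U^T acting on Bob's particles
print(max_diff(lhs, rhs))
```

Applying `U` itself (rather than its transpose) on the B side gives a different
state for this non-symmetric example, which is why the transpose matters.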

Alice's tasks in the 1-EPP protocol are thus reduced to making one-particle
measurements $\mathcal{M}$ on $n-m$ of the A particles, making Bell
measurements $\mathcal{M}_4$ between the $m$ qubits $|\xi\rangle$ to be
protected and her remaining $m$ particles (as in quantum teleportation f|), and
applying $U_1^T$ to the B particles before sending them, along with her
classical measurement results, to Bob. (Recall from the Introduction that $m$
is the yield of good singlets from the purification protocol.)

However, the $n-m$ one-particle measurements $\mathcal{M}$ can be eliminated
entirely. We use the property of $\Phi^+$ states that if one of the particles is
measured to be $|0\rangle$ or $|1\rangle$ in the $z$ basis, then the other
particle is "collapsed" into the same state |l|, ||. So, rather than creating
$n-m$ entangled states at $I$, Alice simply prepares $n-m$ qubits in a definite
state and sends them directly into the $U_1^T$ operation. To mimic the
randomness of the measurement $\mathcal{M}$, Alice might do $n-m$ coin flips to
decide what the prepared state of these B particles will be, and send this
classical data on to Bob. But this is unnecessary, since by hypothesis the
1-EPP always yields perfect entangled pairs ($\Phi^+$), no matter what the
values of the $\mathcal{M}$ measurements were. So Alice and Bob may as well
pre-agree on some particular definite set of values (e.g., all 0's), and Alice
will always pre-set those B particles to that state. |4Ti|

The only A particles remaining in the protocol at this point are the $m$
particles forming the halves of perfect EPR pairs with Bob, and which are
immediately used for teleportation to Bob. But we note that, following the
usual rules of teleportation, the measurement $\mathcal{M}_4$ causes the
corresponding B particles, immediately after their creation at source $I$, to
be in the state $|\xi\rangle$ (if the measurement outcome were 00), or a
rotated version, $\sigma_{x,y,z}|\xi\rangle$ (for the other measurement
outcomes). Again, the protocol should succeed no matter what the value of this
measurement; therefore, if Alice and Bob pre-agree that this classical data
should be taken to have the value 00, then Alice can eliminate the A particles
entirely, eliminate the preparation $I$ of entangled states, and simply feed in
the $|\xi\rangle$ states directly as B particles into the $U_1^T$
transformation. (Bob also does the $U_4$ operation of Fig. |3] appropriate for
00, namely, a no-op.)

Finally we step back to see the effect that this series of transformations
has produced, as summarized in Fig. 16. All use of the bipartite states
produced at $I$, and the corresponding A particles, has been eliminated, along
with all the measurement results transmitted to Bob. The net effect is that
Alice has taken the $m$-qubit unknown quantum state $|\xi\rangle$ along with
$n-m$ "blank" qubits, processed them with $U_1^T$, and sent them on channel
$\chi$ to Bob. He is able to use his half of the protocol, without any
additional classical messages, to reconstruct $|\xi\rangle$. This, of course,
is precisely the in-place QECC that we want.

Figure 16: The one-way purification protocol of Fig. |] may be transformed
into the quantum-error-correcting-code protocol shown here. In a QECC, an
arbitrary quantum state $|\xi\rangle$, along with some qubits which are
originally set to $|0\rangle$, are encoded in such a way by $U_1^T$ that, after
being subjected to errors $N_B$, decoding $U_2$ followed by measurement
$\mathcal{M}$, followed by final rotation $U_3$, permits an exact
reconstruction of the original state $|\xi\rangle$.



6.2 Finite block-size purification and error correcting 
codes 

We have now shown that Bell-state purification procedures can be mapped
directly into quantum error correcting codes. This gives an alternative way
to look for quantum error correction procedures within the purification
approach. This can be both analytically and computationally useful. In fact,
we can take over everything which we obtained via the hashing protocol of
Sec. 3.2.3, in which Alice and Bob perform a sequence of unilateral and
bilateral unitary operations to transform their bipartite state from one
collection of Bell states to another, in order to gain information about the
errors to which their particles have been subjected.

In this section we will show that this approach can also be used to do
purification, and thus error correction, in small, finite blocks of qubits, in
the spirit of much of the other recent work on QECC [9], [10], [11], [12],
[13], [14], [15], [16]. In these procedures the object is slightly different
than in the protocols which employ asymptotically large block sizes: here, we
wish to purify a finite block of $n$ EPR pairs, of which no more than $t$ have
interacted with the environment (i.e., been subjected to noise). The end
result is to be $m < n$ maximally entangled pairs, for which $F = 1$ exactly.
The explicit result we present below will be for $n = 5$, $m = 1$, and $t = 1$.
This protocol thus has the same capability as the one recently reported by
Laflamme et al. [12], although the quantum network which we derive below is
simpler in some respects. We are still investigating the extent to which our
two protocols are equivalent.

The general approach will be the same as in Sec. 3; however, our earlier
emphasis was on error correction in asymptotically large blocks of states.
To deal with the finite-block case, we will need a few small but important
modifications:

• There will again be a set $\mathcal{L}$ of possible collections of Bell
states after the action of the noise $N_B$; but rather than being a "likely
set" defined by the fidelity of the channel, we will characterize the noise by
a promise that the number of errors cannot exceed a certain number $t$. Cases
with $t + 1$ errors are not just deemed to have low probability; they are
declared to be disallowed, following Shor [9].

• The set $\mathcal{L}$ will have a definite, finite size; if the size of the
Bell state block is $n$ and the number of erroneous Bell states to be corrected
is $t$, then the size of the set is [13]

$$S = \sum_{p=0}^{t} 3^p \binom{n}{p}. \quad (67)$$

Borrowing the traditional language of error correction, each member
of the set, indexed by $i$, $1 \le i \le S$, defines an error syndrome. The
"3" in Eq. (67) corresponds to the number of possible incorrect Bell
states occurring in the evolution of Eq. (63): there is either a phase
error ($\Phi^+ \to \Phi^-$), an amplitude error ($\Phi^+ \to \Psi^+$), or both
($\Phi^+ \to \Psi^-$). It has been noted [10], [13] that correcting these three
types of error is sufficient to correct any arbitrary noise to which the
quantum state is subjected, which we prove in Appendix B.

• The object of the error correction is slightly different than in Sec. 3:
in the earlier case it was to find a protocol where the fidelity of the
remaining EPR pairs approached unity asymptotically as $n \to \infty$. In
the finite-block case, the object is to find a protocol such that the
fidelity attains exactly 100%, that is, $m$ good EPR pairs are guaranteed
to be recoverable from the original set of $n$ Bell states for every single
one of the $S$ error syndromes.

Let us emphasize again that, in the purification language which we have
developed, the quantum error correction problem has been turned into an
entirely classical exercise: given a set of $n$ Bell states, we use the
operations of item H] in Sec. 3T to create a classical Boolean function which
maps these Bell states onto others such that, for all $S$ of the error
syndromes, the first $m$ Bell states are always the same when the measurement
results on the remaining $n - m$ Bell states are the same.

We will develop this informal statement of the problem in a more formal
mathematical language. First, recall the code which we introduced for the
Bell states in an item of Sec. [3J], in which, for example, a particular
collection of three Bell states is coded as the 6-bit word 001000. As in our
hashing-protocol discussion (Sec. 3.2.3), we denote such words by $x^{(i)}$,
where the superscript $i$ denotes the word appropriate for the $i$th error
syndrome. These words have $2n$ bits, and we will sometimes denote by
$x^{(i)}_k$ the $k$th bit of the word.
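
To make the bookkeeping concrete, the allowed words $x^{(i)}$ can be
enumerated directly. The sketch below is illustrative only: the function names
are hypothetical, and it assumes the two-bit-per-Bell-state coding described
above, with 00 the error-free state and 01, 10, 11 the three erroneous ones.
It also compares the enumeration against the count
$\sum_{p \le t} 3^p \binom{n}{p}$ (three possible errors on each of $p$
erroneous Bell states).

```python
from itertools import combinations, product
from math import comb

def syndrome_words(n, t):
    """All 2n-bit words in which at most t of the n Bell states differ from 00."""
    words = []
    for p in range(t + 1):
        for positions in combinations(range(n), p):
            # Each erroneous Bell state is one of 01, 10, 11 (encoded as 1, 2, 3).
            for errors in product((1, 2, 3), repeat=p):
                pairs = [0] * n
                for pos, err in zip(positions, errors):
                    pairs[pos] = err
                bits = []
                for pair in pairs:
                    bits += [pair >> 1, pair & 1]   # high (left) bit, low (right) bit
                words.append(tuple(bits))
    return words

def syndrome_count(n, t):
    """Closed-form count of error syndromes for block size n and up to t errors."""
    return sum(3 ** p * comb(n, p) for p in range(t + 1))

words = syndrome_words(5, 1)
print(len(words), syndrome_count(5, 1))
```

For $t = 1$ the count reduces to $3n + 1$, so a block of five Bell states has
16 syndromes.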




Figure 17: The 1-EPP of Fig. [] marked with the notation used in this section.

Alice and Bob subject the Bell states coded by $x^{(i)}$ to the unitary
transformations $U_1$ and $U_2$. They are confined to performing sequences of
the unilateral and bilateral operations introduced in Table 1. In particular,
they can do either:

1. a bilateral XOR, which flips the low (right) bit of the target iff the low
bit of the source is 1, and flips the high (left) bit of the source iff the
high bit of the target is 1;

2. a bilateral $\pi/2$ rotation $B_y$ of both spins in a pair about the
$y$-axis, which interchanges the high and low bits;

3. a unilateral (by either Alice or Bob) $\pi$ rotation $\sigma_z$ of one spin
about the $z$-axis, which complements the low bit; or

4. a composite operation $\sigma_x B_x$, where the $\sigma_x$ operation is
unilateral and the $B_x$ is bilateral; the simple net effect of this sequence
of operations is to flip the low bit iff the high bit is one.

It is easy to show that with these four operations, Alice and Bob can do
anything which they can do with the full set of operations in Table 1. In
our classical representation, the effect of such a sequence of operations is to
apply a classical Boolean function $L_u$ to $x^{(i)}$, yielding a string
$w^{(i)}$:

$$w^{(i)} = L_u(x^{(i)}). \quad (68)$$



We use the symbol $L_u$ for this function because, with the operations that
Alice and Bob have at their disposal, $L_u$ is constrained to be a linear,
reversible Boolean function. This is easy to show for the sequences of the four
operations given above. Note, however, that not all linear reversible Boolean
functions are obtainable with this repertoire. A linear Boolean function [44]
can be written as a matrix equation

$$w^{(i)} = M_u x^{(i)} + b. \quad (69)$$

Here the matrix $M_u$ and the vector $b$ are Boolean-valued ($\in \{0,1\}$), and
addition is defined modulo 2. Reversibility adds an additional constraint:
$\det(M_u) = 1$ (modulo 2). In a moment we will write down the condition
which the set of $w^{(i)}$ must satisfy in order for purification to succeed.
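
The linearity and reversibility claims can be illustrated with a short
classical simulation. This is a sketch under stated assumptions rather than
the program used for this work: the operation labels (`"bxor"`, `"by"`,
`"sz"`, `"sxbx"`) and all function names are hypothetical, and the bit actions
are taken literally from the verbal descriptions of the four operations, with
Bell state $j$ (counted from 0) occupying bits $2j$ (high) and $2j+1$ (low).

```python
import random
from itertools import product

def apply_op(w, op):
    """Apply one of the four basic operations to a 2n-bit word (list of 0/1)."""
    w = list(w)
    kind = op[0]
    if kind == "bxor":                      # op = ("bxor", source, target)
        _, s, t = op
        w[2 * t + 1] ^= w[2 * s + 1]        # target low bit flips iff source low bit is 1
        w[2 * s] ^= w[2 * t]                # source high bit flips iff target high bit is 1
    elif kind == "by":                      # interchange high and low bits
        j = op[1]
        w[2 * j], w[2 * j + 1] = w[2 * j + 1], w[2 * j]
    elif kind == "sz":                      # complement the low bit
        w[2 * op[1] + 1] ^= 1
    elif kind == "sxbx":                    # flip the low bit iff the high bit is 1
        j = op[1]
        w[2 * j + 1] ^= w[2 * j]
    else:
        raise ValueError(kind)
    return w

def apply_ops(x, ops):
    for op in ops:
        x = apply_op(x, op)
    return x

def affine_form(ops, n):
    """Read off (M, b) with L(x) = M x + b (mod 2) from L(0) and L(e_k)."""
    size = 2 * n
    b = apply_ops([0] * size, ops)
    cols = []
    for k in range(size):
        e = [0] * size
        e[k] = 1
        cols.append([u ^ v for u, v in zip(apply_ops(e, ops), b)])
    M = [[cols[k][r] for k in range(size)] for r in range(size)]
    return M, b

def affine_apply(M, b, x):
    return [(sum(m & xi for m, xi in zip(row, x)) + bi) % 2 for row, bi in zip(M, b)]

def random_ops(n, length, rng):
    ops = []
    for _ in range(length):
        kind = rng.choice(["bxor", "by", "sz", "sxbx"])
        if kind == "bxor":
            s, t = rng.sample(range(n), 2)  # distinct source and target
            ops.append((kind, s, t))
        else:
            ops.append((kind, rng.randrange(n)))
    return ops

n = 3
ops = random_ops(n, 12, random.Random(0))
M, b = affine_form(ops, n)
all_words = [list(x) for x in product((0, 1), repeat=2 * n)]
images = [apply_ops(x, ops) for x in all_words]
is_affine = all(img == affine_apply(M, b, x) for x, img in zip(all_words, images))
is_reversible = len({tuple(img) for img in images}) == len(all_words)
print(is_affine, is_reversible)
```

The check is exhaustive over all $2^{2n}$ words, which is only practical for
small $n$; for larger blocks one would test reversibility through the
determinant of $M_u$ modulo 2 instead.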

The next step of purification is a measurement $\mathcal{M}$ of $n - m$ of the
Bell states. As discussed in item ^| of Sec. |3~TT|, after learning Alice's
measurement result, Bob can deduce the low bit of each of the measured Bell
states. If we write these measurement results for error syndrome $i$ as
another Boolean word $v^{(i)}$ (of length $n - m$), the measurement can be
expressed as another linear Boolean function:

$$v^{(i)} = M_m w^{(i)}. \quad (70)$$

The matrix elements of $M_m$ are

$$(M_m)_{kl} = \delta_{l,\,2(m+k)}. \quad (71)$$

The state of the remaining unmeasured Bell states is coded in a truncated
word $w'^{(i)}$ of length $2m$:

$$w'^{(i)} = (w_1 w_2 \ldots w_{2m})^{(i)}. \quad (72)$$

We now have all the machinery to state the condition for a successful
purification. The object is to perform a final rotation $U_3$ on the state coded
by $w'$ and restore it, for every error syndrome $i$, to the state 00...0.
Whatever $w'$ is, such a restoring $U_3$ is always available to Bob; for each
Bell state, he does the Pauli rotations:

Bell state    $U_3$ transformation
00            $I$ (do nothing)
01            $\sigma_z$                      (73)
10            $\sigma_x$
11            $\sigma_y$

But Bob must know which of these four rotations to apply to each of the
remaining $m$ Bell states. The only information he has on which of them to
perform are the bits of the measurement vector $v^{(i)}$. This information will
be sufficient if, for every error syndrome which produces a distinct $w'$, $v$
is distinct; in this case, Bob will know exactly which final rotation $U_3$ to
apply.

This, then, is our final condition for successful purification. In more
mathematical language, we require an operation $L_u$ for which

$$\forall i,j:\quad w'^{(i)} \ne w'^{(j)} \;\Rightarrow\; v^{(i)} \ne v^{(j)}. \quad (74)$$
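
The condition just stated is easy to test mechanically for a given set of
transformed words. The following sketch is an illustration with hypothetical
function names; it assumes that the measured word consists of the low bits of
the last $n - m$ Bell states and that the unmeasured word is the first $2m$
bits. It also implements the stronger requirement that all measurement
outcomes be distinct, for comparison.

```python
def measured_bits(w, n, m):
    """Low bit of each of the last n - m Bell states (1-based bit 2(m+k))."""
    return tuple(w[2 * (m + k) - 1] for k in range(1, n - m + 1))

def unmeasured_bits(w, m):
    """Truncated word: the first 2m bits, coding the m retained Bell states."""
    return tuple(w[:2 * m])

def purification_succeeds(ws, n, m):
    """True iff equal measurement outcomes always imply equal unmeasured words."""
    seen = {}
    for w in ws:
        v, w_prime = measured_bits(w, n, m), unmeasured_bits(w, m)
        if seen.setdefault(v, w_prime) != w_prime:
            return False
    return True

def learns_all_errors(ws, n, m):
    """True iff every syndrome gives a distinct measurement outcome."""
    vs = [measured_bits(w, n, m) for w in ws]
    return len(set(vs)) == len(vs)

# Hypothetical toy sets of transformed words for n = 3, m = 1.
good = [(0, 0, 0, 0, 0, 0), (0, 1, 0, 1, 0, 1), (0, 0, 0, 1, 0, 0), (0, 0, 0, 0, 0, 1)]
weak = [(0, 0, 0, 0, 0, 0), (0, 0, 1, 0, 0, 0)]   # same outcome, same unmeasured word
bad = [(0, 0, 0, 0, 0, 0), (0, 1, 0, 0, 0, 0)]    # same outcome, different unmeasured word

for name, ws in (("good", good), ("weak", weak), ("bad", bad)):
    print(name, purification_succeeds(ws, 3, 1), learns_all_errors(ws, 3, 1))
```

The `weak` set shows how the success condition can hold even though the
measurement does not identify the syndrome uniquely.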



We will shortly show the results of a search for $L_u$ which satisfy Eq. (74).
But first, we touch on a point which has been raised in the recent literature
[11], [10], [13], [12]: Bob will obviously know which rotation $U_3$ to apply
if from the measurement he learns the precise error syndrome, that is, if for
each error syndrome the measurement outcome is distinct. This "condition for
learning all the errors" may be stated mathematically in a way parallel to
Eq. (74):

$$\forall i \ne j:\quad v^{(i)} \ne v^{(j)}. \quad (75)$$

This condition is obviously sufficient for successful error correction; however,
it is more restrictive than Eq. (74), and it is not a necessary condition. If
Eq. (75) were a necessary condition for error correction, then a comparison
of the number of possible distinct measurements $v^{(i)}$ with the number of
error syndromes $S$ would lead [13], [12] to a restriction on the block size in
which a certain number of errors can be corrected:

$$S = \sum_{p=0}^{t} 3^p \binom{n}{p} \le 2^{n-m}. \quad (76)$$

It is this bound which is attained, asymptotically, by the hashing and breeding
protocols above. However, Eq. (74) puts no obvious restriction on the
block size in which error correction can succeed, suggesting that the bound
Eq. (76) can actually be exceeded. For example, if the transformation $L_u$
were permitted to be any arbitrary Boolean function, then it would be capable
of setting $w' = 00\ldots0$ for every syndrome $i$, in which case no error
correction measurements $v$ would be needed.
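
As a small worked example of the counting bound (the number of syndromes must
not exceed the number of distinct measurement outcomes, $2^{n-m}$), the sketch
below finds the smallest block size compatible with it. The function names are
hypothetical, and the bound is only the counting restriction discussed here,
not a statement that a code of that size exists.

```python
from math import comb

def syndrome_count(n, t):
    """Number of error syndromes: sum over p <= t of 3^p * C(n, p)."""
    return sum(3 ** p * comb(n, p) for p in range(t + 1))

def satisfies_counting_bound(n, m, t):
    """True iff the syndromes fit into the 2^(n-m) possible measurement outcomes."""
    return syndrome_count(n, t) <= 2 ** (n - m)

def smallest_block(m, t, n_max=64):
    """Smallest n > m meeting the counting bound, or None if none up to n_max."""
    for n in range(m + 1, n_max + 1):
        if satisfies_counting_bound(n, m, t):
            return n
    return None

for n in range(2, 6):
    print(n, syndrome_count(n, 1), 2 ** (n - 1), satisfies_counting_bound(n, 1, 1))
print(smallest_block(1, 1))
```

For $m = 1$, $t = 1$ the bound is met with equality at $n = 5$, where
$3n + 1 = 16 = 2^{4}$.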

However, $L_u$ is very strongly constrained in addition to being a linear,
reversible Boolean function, and we are left uncertain to what degree the
bound Eq. (76) may be violated. For the small cases which we have explored
below, in which one Bell state is restored from single-qubit errors ($m = 1$,
$t = 1$), we find that the bound of Eq. (76) is not exceeded. All solutions
which we find which satisfy Eq. (74) also happen to identify every error
syndrome uniquely (Eq. (75)). The present work, therefore, does not demonstrate
that Eq. (74) actually leads to more powerful error-correction schemes than
Eq. (75). However, Shor and Smolin |3^| have recently exhibited a family of new
protocols which, at least asymptotically for large $n$, exceed the bound
Eq. (76) by a small but finite amount.



6.3 Monte Carlo results for finite-block purification 
protocols 

For the single-error ($t = 1$), single-purified-state ($m = 1$) case, we have
performed a Monte Carlo computer search for unitary transformations $U_1$ and
$U_2$. The program first tabulates the $x^{(i)}$ for all the allowed error
syndromes $i$, as shown in Table 3. (For the case of $t = 1$ there are
$S = 3n + 1$ error syndromes, since any of the $n$ Bell states could suffer
three types of error, plus one for the no-error case.) The program then
randomly selects one of the four basic operations enumerated above, and
randomly selects a Bell state or pair of Bell states to which to apply the
operation. The program then checks whether the resulting set of states
$w^{(i)}$ satisfies the error-correction condition of Eq. (74). If the answer
is no, then the program repeats the procedure, adding another random operation.
If the answer is yes, the program saves the list of operations, and starts
over, seeking a shorter solution. Two "shortness" criteria were explored:
fewest total operations, and fewest total BXORs (since two-bit operations could
be the more difficult ones to implement in a physical apparatus [32]).
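
The search loop just described can be sketched in a few lines. This is not
the original program: the names, the restart budget, and the operation
encodings are assumptions made for illustration, and only the "fewest total
operations" criterion is implemented. The demonstration at the end uses a
deliberately easy toy problem (three Bell states, only low-bit errors) rather
than the full single-error problem, for which no claim is made about how
quickly this naive version finds a solution.

```python
import random

def apply_op(w, op):
    """The four basic operations acting on a 2n-bit word (high bit 2j, low bit 2j+1)."""
    w = list(w)
    kind, a, b = op
    if kind == "bxor":                 # a = source, b = target
        w[2 * b + 1] ^= w[2 * a + 1]
        w[2 * a] ^= w[2 * b]
    elif kind == "by":                 # swap high and low bits
        w[2 * a], w[2 * a + 1] = w[2 * a + 1], w[2 * a]
    elif kind == "sz":                 # complement the low bit
        w[2 * a + 1] ^= 1
    elif kind == "sxbx":               # flip low bit iff high bit is 1
        w[2 * a + 1] ^= w[2 * a]
    return w

def random_op(n, rng):
    kind = rng.choice(["bxor", "by", "sz", "sxbx"])
    if kind == "bxor":
        a, b = rng.sample(range(n), 2)
        return (kind, a, b)
    return (kind, rng.randrange(n), None)

def condition_ok(words, n, m):
    """Equal measured low bits (Bell states m..n-1) must imply equal first 2m bits."""
    seen = {}
    for w in words:
        v = tuple(w[2 * j + 1] for j in range(m, n))
        w_prime = tuple(w[:2 * m])
        if seen.setdefault(v, w_prime) != w_prime:
            return False
    return True

def random_search(syndromes, n, m, max_ops=20, restarts=2000, seed=0):
    """Return the shortest operation list found, or None if the budget runs out."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        limit = max_ops if best is None else len(best) - 1   # only accept shorter ones
        words = [list(x) for x in syndromes]
        ops = []
        while len(ops) < limit:
            op = random_op(n, rng)
            ops.append(op)
            words = [apply_op(w, op) for w in words]
            if condition_ok(words, n, m):
                best = list(ops)
                break
    return best

# Toy problem (an assumption for illustration): no error, or one low-bit error.
toy = [(0, 0, 0, 0, 0, 0), (0, 1, 0, 0, 0, 0), (0, 0, 0, 1, 0, 0), (0, 0, 0, 0, 0, 1)]
solution = random_search(toy, n=3, m=1)
print(solution)
```

A second criterion, such as the number of BXORs, could be added by changing
the comparison that decides whether a newly found sequence replaces `best`.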

A simple argument akin to the one of Sec. |] shows that error correction
in a block of 2 ($t = 1$, $m = 1$, $n = 2$) is impossible. We performed an
extensive search for $n = 3$ and $n = 4$ codes; it would not be possible to
detect the complete error syndrome for these cases (Eq. (76)), but it would
appear a priori possible to satisfy Eq. (74). Nevertheless, no solutions were
found, strongly suggesting that, for this case, $n = 5$ is the best block code
possible [12]. Knill and Laflamme have recently proved this [40].

Our search found many solutions for $n = 5$ with similar numbers of
quantum gate operations. The minimal network which was eventually found
was one with 11 operations, 6 of which were BXORs. Here we present a
complete analysis of a slightly different solution, which involves 12
operations, 7 of which are BXORs. The gate array for this solution is shown in
Fig. 18. The complete action of $U_1$ and $U_2$ produced by this quantum
network is given in Table 3.

Note that, as indicated above, this code not only satisfies the actual
error-correction criterion Eq. (74), but it also satisfies the stronger
condition Eq. (75); all the error syndromes are distinguished by the
measurement results $v^{(i)}$.

It is interesting to note, as a check, that the tabulated transformation is
indeed a reversible, linear Boolean operation. The reader may readily confirm
that the results of Table 3 are obtained from the linear transformation of
Eq. (69) with the appropriate matrix $M_u$ and vector $b$.

Table 3: Possible initial Bell states and the resulting final state after the
gate array of Fig. 18 has been applied.

Figure 18: The quantum gate array, determined by our computer search,
which protects one qubit from single-bit errors in a block of five. The gates
shown are bilateral XORs (each with a source and a target), bilateral $y$
rotations, and unilateral $\sigma_z$ rotations. "Bilateral" and "unilateral"
refer to whether both Alice and Bob, or only Alice (or Bob), perform the
indicated steps in the 1-EPP; in the QECC version, it corresponds to whether
the operation is done in both coding and decoding, or in just the coding (or
decoding) operations.

6.4 Alternative conditions for successful quantum error correction code

While all of our work has involved deriving QECCs using the 1-EPP
construction, it is possible, and instructive, to formulate the conditions for
a good error correcting code directly in the QECC language. As Shor first
showed [9], in this language the requirements become a set of constraints
which the subspace into which the quantum bits are encoded must satisfy.
In the course of our work we derived a set of general conditions for the case
of error-correcting a single bit ($m = 1$). They are quite similar to
conditions which other workers have formulated recently [13], [45]. Knill and
Laflamme have recently obtained the same condition [40].

We will assume that only one qubit is to be protected, but the generalization
to multiple qubits is straightforward. Suppose a qubit is encoded (by $U_1^T$
in Fig. 16) as a state

$$|\xi\rangle = \alpha|v_0\rangle + \beta|v_1\rangle, \quad (79)$$

where $\alpha$ and $\beta$ are arbitrary except for the normalization condition

$$|\alpha|^2 + |\beta|^2 = 1, \quad (80)$$

and $|v_0\rangle$ and $|v_1\rangle$ are two basis vectors in the
high-dimensional Hilbert space of the quantum memory block. Can $|v_0\rangle$
and $|v_1\rangle$ be chosen such that, after the quantum state is subjected to
Werner-type errors, the original quantum state can still be perfectly
reconstituted as the state of a single qubit,

$$|\xi_f\rangle = \alpha|0\rangle + \beta|1\rangle\,? \quad (81)$$

We shall derive the conditions which $|v_0\rangle$ and $|v_1\rangle$ must
satisfy in order for this to be true.

We specify the action of the noise as a mapping of the original quantum
state into an ensemble of unnormalized state vectors given by applying the
linear operators $R_i$ to the original state vector:

$$|\xi\rangle \to R_i|\xi\rangle. \quad (82)$$

For each error syndrome $i$ there is an (unnormalized) operator $R_i$
specifying the effect of the noise, as in Eq. (63). For single-bit errors, the
$R_i$'s are just proportional to a $\sigma_x$, $\sigma_y$, or $\sigma_z$
operator applied to one of the quantum-memory qubits, as discussed below.
Two-bit errors would involve operators like
$R_i = \sigma^{\alpha}_{x,y,z}\,\sigma^{\beta}_{x,y,z}$ applied to two different
qubits $\alpha$ and $\beta$, and so forth. Equivalently to Eq. (82), the effect
of the noise $N_B$ in Fig. 16 can be expressed as an ensemble of normalized
state vectors $|\xi_i\rangle$ with their associated probabilities $p_i$:

$$|\xi\rangle \to \{p_i, |\xi_i\rangle\} = \left\{ \langle\xi|R_i^\dagger R_i|\xi\rangle,\ \frac{R_i|\xi\rangle}{\sqrt{\langle\xi|R_i^\dagger R_i|\xi\rangle}} \right\}. \quad (83)$$

The Werner noise can be set up so that the $p_i$'s are the probabilities that
the environment "measures" the $i$th outcome of a pointer or ancilla space. We
can evaluate the probability $p_i$ (for the $i$th outcome of these
measurements) for the state Eq. (79) using the expression in Eq. (83):

$$p_i = (\alpha^*, \beta^*) \begin{pmatrix} \langle v_0|R_i^\dagger R_i|v_0\rangle & \langle v_0|R_i^\dagger R_i|v_1\rangle \\ \langle v_1|R_i^\dagger R_i|v_0\rangle & \langle v_1|R_i^\dagger R_i|v_1\rangle \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \quad (84)$$

We have used the linearity of the operators $R_i$. The matrix notation used in
Eq. (84) will prove useful in a moment.

The first, necessary condition which must be satisfied in order that the
state may be reconstituted as in Eq. (81) is that the environment producing
the Werner noise can acquire no information about the initial quantum state
by doing this ancilla measurement. This will be true so long as $p_i$ in
Eq. (84) is not a function of the state vector coefficients $\alpha$ and
$\beta$. It may be noted that the right hand side of Eq. (84) has the form of
the expectation value of a $2 \times 2$ Hermitian operator in the state
$(\alpha,\beta)^T$. It is a well-known theorem of linear algebra that such an
operator has an expectation value independent of the state vector
$(\alpha,\beta)^T$ iff the Hermitian operator is proportional to the identity
operator. This gives us the first two conditions that the state vector may be
recovered exactly: $\forall i$,

$$\langle v_0|R_i^\dagger R_i|v_0\rangle = \langle v_1|R_i^\dagger R_i|v_1\rangle = p_i,$$
$$\langle v_1|R_i^\dagger R_i|v_0\rangle = 0. \quad (85)$$

If this condition is satisfied, then the ensemble of state vectors in Eq. (82)
can be written in the simplified form:

$$\alpha|v_0\rangle + \beta|v_1\rangle \to \left\{ p_i,\ \frac{1}{\sqrt{p_i}}\left(\alpha R_i|v_0\rangle + \beta R_i|v_1\rangle\right) \right\}. \quad (86)$$



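Now, given that the environment learns nothing from the measurement, a
further, sufficient condition is that there exist a unitary transformation
($U_2$) which takes each of the state vectors of Eq. (86) to a vector of the
form:

$$\frac{1}{\sqrt{\langle v_0|R_i^\dagger R_i|v_0\rangle}} \left(\alpha R_i|v_0\rangle + \beta R_i|v_1\rangle\right) \to (\alpha|0\rangle + \beta|1\rangle)\,|a_i\rangle. \quad (87)$$

Here $|a_i\rangle$ is a normalized state vector of all the qubits excluding the
one which will contain the final state Eq. (81). Because of unitarity, the
angle between any two state vectors must be preserved. Taking the dot product
of the state vectors resulting from two different syndromes $i$ and $j$, and
equating the result before and after the unitary operation gives:

$$\frac{1}{\sqrt{\langle v_0|R_i^\dagger R_i|v_0\rangle \langle v_0|R_j^\dagger R_j|v_0\rangle}}\, (\alpha^*, \beta^*) \begin{pmatrix} \langle v_0|R_i^\dagger R_j|v_0\rangle & \langle v_0|R_i^\dagger R_j|v_1\rangle \\ \langle v_1|R_i^\dagger R_j|v_0\rangle & \langle v_1|R_i^\dagger R_j|v_1\rangle \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$$
$$\qquad = |\alpha|^2 \langle a_i|a_j\rangle + |\beta|^2 \langle a_i|a_j\rangle = \langle a_i|a_j\rangle. \quad (88)$$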




In the last part we have used the normalization condition to eliminate
$\alpha$ and $\beta$. Now, since the right hand side of Eq. (88), and the
prefactor of the left hand side, are independent of $\alpha$ and $\beta$, so
must be the expectation value of the $2 \times 2$ Hermitian operator. We again
conclude that this Hermitian operator must be proportional to the identity
operator, and this gives the final necessary and sufficient conditions [46]
for successful storage of the quantum data: $\forall i,j$,

$$\langle v_0|R_i^\dagger R_j|v_0\rangle = \langle v_1|R_i^\dagger R_j|v_1\rangle, \quad (89)$$
$$\langle v_1|R_i^\dagger R_j|v_0\rangle = 0. \quad (90)$$

□

For the specific 5-qubit code described above, we found (by another, simple
computer calculation) that the two basis vectors of Eq. (79) are:

$$\begin{aligned} |v_0\rangle \propto\ & -|00000\rangle - |11000\rangle - |01100\rangle - |00110\rangle - |00011\rangle - |10001\rangle \\ & + |10010\rangle + |10100\rangle + |01001\rangle + |01010\rangle + |00101\rangle \\ & + |11110\rangle + |11101\rangle + |11011\rangle + |10111\rangle + |01111\rangle, \end{aligned} \quad (91)$$

i.e., a superposition of all even-parity kets, with particular signs, and

$$|v_1\rangle = \text{the corresponding vector with 0 and 1 interchanged}. \quad (92)$$

It is easy to confirm that this pair of vectors satisfies the conditions Eqs.
(89) and (90). It is interesting to note that these two vectors do not span the
same two-dimensional subspace as the ones recently reported by Laflamme
et al. [12]; but it has recently been shown that they are related to one another
by one-bit rotations.
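
The confirmation mentioned above can be reproduced with a short calculation.
The sketch below is an illustration, not the original computer calculation:
it assumes that the single-bit error operators may be taken to be the identity
and the three Pauli operators on each of the five qubits, uses the kets and
signs listed for the first basis vector, and represents states as sparse
dictionaries; all names are hypothetical.

```python
from itertools import product

MINUS = ["00000", "11000", "01100", "00110", "00011", "10001"]
PLUS = ["10010", "10100", "01001", "01010", "00101",
        "11110", "11101", "11011", "10111", "01111"]

def code_vectors():
    """Normalized basis vectors: v0 from the listed signs, v1 with 0 and 1 interchanged."""
    v0 = {ket: -0.25 for ket in MINUS}
    v0.update({ket: 0.25 for ket in PLUS})
    flip = str.maketrans("01", "10")
    v1 = {ket.translate(flip): amp for ket, amp in v0.items()}
    return v0, v1

def apply_pauli(state, pauli, q):
    """Apply 'I', 'X', 'Y' or 'Z' on qubit q (0-based) to a sparse state {ket: amplitude}."""
    out = {}
    for ket, amp in state.items():
        bit = int(ket[q])
        if pauli == "I":
            new, c = ket, 1
        elif pauli == "Z":
            new, c = ket, (-1) ** bit
        else:
            new = ket[:q] + str(1 - bit) + ket[q + 1:]
            # X flips the bit; Y|0> = i|1>, Y|1> = -i|0>.
            c = 1 if pauli == "X" else (1j if bit == 0 else -1j)
        out[new] = out.get(new, 0) + c * amp
    return out

def inner(a, b):
    """<a|b> for sparse states."""
    return sum(complex(amp).conjugate() * b.get(ket, 0) for ket, amp in a.items())

def check_conditions(v0, v1, tol=1e-12):
    """Check <v0|Ri+ Rj|v0> = <v1|Ri+ Rj|v1> and <v1|Ri+ Rj|v0> = 0 for all i, j."""
    errors = [("I", 0)] + [(p, q) for q in range(5) for p in "XYZ"]
    for (pi, qi), (pj, qj) in product(errors, repeat=2):
        ri_v0, ri_v1 = apply_pauli(v0, pi, qi), apply_pauli(v1, pi, qi)
        rj_v0, rj_v1 = apply_pauli(v0, pj, qj), apply_pauli(v1, pj, qj)
        diagonal_gap = inner(ri_v0, rj_v0) - inner(ri_v1, rj_v1)
        off_diagonal = inner(ri_v1, rj_v0)
        if abs(diagonal_gap) > tol or abs(off_diagonal) > tol:
            return False
    return True

v0, v1 = code_vectors()
print(check_conditions(v0, v1))
```

Changing any single sign in the list makes the check fail, which gives a quick
way to catch transcription errors in the basis vector.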

6.5 Implications of error-correction conditions on chan- 
nel capacity 



Knill and Laflamme [40] have used the error correction conditions (Eqs. (89)
and (90)) to provide a stronger upper bound for $Q$ and $D_1$ than the one of
Sec. [| by showing that $D_1 = 0$ when $F = 0.75$. We indicate this on Figs. 8
and 9, using our channel-additivity result of Sec. 5.2 to extend this to the
linear bound shown. Their proof is as follows: write the coded qubit basis
states (cf. Eqs. (0) and p3)) as

$$|v_i\rangle = \sum_x \alpha^i_x |x\rangle = \sum_{y:z} \alpha^i_{y:z} |y:z\rangle, \qquad i = 0, 1. \quad (93)$$

Here $x$ stands for an $n$-bit binary number, and $y:z$ stands for a
partitioning of $x$ into a $2t$-bit substring $y$ and an $(n - 2t)$-bit
substring $z$. (The partitioning may be arbitrary, and need not be into the
least significant and most significant bits.) Knill and Laflamme then consider
the reduced density matrices on the $y$ and the $z$ spaces:

$$\rho^i_{n-2t} = \sum_{y,z_1,z_2} \alpha^i_{y:z_1} (\alpha^i_{y:z_2})^* \, |z_1\rangle\langle z_2|, \quad (94)$$

$$\rho^i_{2t} = \sum_{y_1,y_2,z} \alpha^i_{y_1:z} (\alpha^i_{y_2:z})^* \, |y_1\rangle\langle y_2|. \quad (95)$$

Knill and Laflamme then prove two operator equations. First:

$$\rho^0_{n-2t}\, \rho^1_{n-2t} = 0. \quad (96)$$

This is proved by using the condition for a successful error-correction code
(Eq. (90)), where the linear operator $R_i$ operates on a set of $t$ bits, and
$R_j$ operates on a different set of $t$ bits. (These $R$'s should be taken as
projection operators in this proof.) Likewise, by applying Eq. (89) with the
same operators $R_i$ and $R_j$, they prove

$$\rho^0_{2t} = \rho^1_{2t}. \quad (97)$$

These two equations give a contradiction when the two substrings are of the
same size, because they say that the reduced matrices are simultaneously
orthogonal and identical. This says that no code can exist if $2t = n - 2t$,
which corresponds to $F = 1 - t/n = 0.75$. As a bonus, these results give an
interesting insight into the behavior of coded states: no measurement on $2t$
qubits can reveal anything about whether a 0 or a 1 is encoded, while there
exists a measurement on $n - 2t$ qubits which will distinguish with certainty
a coded 0 from a coded 1.

This result shows that the lowest fidelity Werner channel with finite
capacity must have $F > 0.75$. Call that fidelity $F_0$. Consider a channel with
fidelity $F$ between $F_0$ and 1. The capacity of this channel is no greater
than that of a composite channel consisting of a perfect channel used a
fraction $(F - F_0)/(1 - F_0)$ of the time and a channel with fidelity $F_0$
used a fraction $(1 - F)/(1 - F_0)$ of the time, because the first channel is
the same as the composite channel provided one is unaware of whether the
fidelity is 1 or $F_0$ on any particular use of the channel. (This construction
is akin to that of Sec. |j.) By the channel additivity argument of Sec. 5.2 the
capacity of the composite channel, which bounds the capacity of the
fidelity-$F$ channel, cannot exceed $(F - F_0)/(1 - F_0)$. Since $F_0$ cannot
be below 0.75 we obtain the straight-line bound

$$Q = D_1 \le 4F - 3, \quad (98)$$

as shown in Figs. 8 and 9.
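
The arithmetic behind the straight-line bound can be spelled out as a short
worked derivation. It is a sketch under the assumptions of the argument above:
$a$ denotes the fraction of uses of the perfect channel (a symbol introduced
here for convenience), the perfect channel has capacity 1, and the channel of
fidelity $F_0$ has capacity 0.

```latex
\begin{align}
a \cdot 1 + (1-a)\,F_0 = F
  \;&\Longrightarrow\;
  a = \frac{F - F_0}{1 - F_0}, \qquad 1 - a = \frac{1 - F}{1 - F_0}, \\
Q(F) \;&\le\; a \cdot 1 + (1-a)\cdot 0 \;=\; \frac{F - F_0}{1 - F_0}, \\
F_0 \ge \tfrac{3}{4}
  \;&\Longrightarrow\;
  Q(F) \;\le\; \frac{F - \tfrac{3}{4}}{1 - \tfrac{3}{4}} \;=\; 4F - 3 .
\end{align}
```

The last step uses the fact that $(F - F_0)/(1 - F_0)$ decreases as $F_0$
increases (for $F < 1$), so the weakest bound is obtained at $F_0 = 3/4$.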



7 Discussion and Conclusions 

There has been an immense amount of recent activity and progress in the 
theory of quantum error-correcting codes, including block codes with some 



error-correction capacities in blocks of two three |T3], [LJ], and four[|l 



Codes which completely correct single-bit errors have now been reported
for block sizes of five as in the present work [12], seven [11], eight [15], and
nine [9]; this is in addition to the work using linear-code theory of families
of codes which work up to arbitrarily large block sizes [10], [11]. A variety
of subsidiary criteria have been introduced, such as correcting only phase 
errors, maintaining constant energy in the coded state, and correction by a 
generalized watchdogging process. Much of this work can be expressed in 
entanglement purification language, in some cases more simply. 

Our results highlight the different uses to which a quantum channel may 
be put. When a noisy quantum channel is used for classical communication, 
the goal — by optimal choice of preparations at the sending end, measure- 
ments at the receiving end, and classical error-correction techniques — is to 
maximize the throughput of reliable classical information. When used for 
this purpose, a simple depolarizing channel from Alice to Bob has a positive 
classical capacity C > provided it is less than 100% depolarizing. Adding 
a parallel classical side channel to the depolarizing quantum channel would 
increase the classical capacity of the combination by exactly the capacity of 
the classical side channel. 

When the same depolarizing channel is used in connection with a QECC 
or EPP to transmit unknown quantum states or share entanglement, its 
quantum capacity Q is positive only if the depolarization probability is
sufficiently small (< 1/3), and this capacity is not increased at all by adjoining
a parallel classical side channel. On the other hand, a classical back channel,
from Bob to Alice, does enhance the quantum capacity, making it positive
for all depolarization probabilities less than 2/3.
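The two thresholds quoted above can be related to pair fidelities with a short calculation. The sketch below assumes that "depolarization probability p" means that, with probability p, the transmitted qubit is replaced by a completely random one; under that assumption, sending half of a Bell pair through the channel leaves a pair of fidelity F = 1 - 3p/4, so p < 1/3 corresponds to F > 3/4 and p < 2/3 to F > 1/2. The function names are hypothetical.

```python
from fractions import Fraction


def fidelity_from_depolarization(p):
    """Fidelity F of the pair left after sending half of a Bell pair through
    a channel that, with probability p, replaces the qubit by a random one.

    The random replacement still yields the correct Bell state 1/4 of the
    time, so F = (1 - p) + p/4 = 1 - 3p/4.  (Assumed convention for p.)
    """
    p = Fraction(p)
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    return 1 - Fraction(3, 4) * p


def one_way_not_excluded(p):
    """Necessary condition quoted in the text: p < 1/3, i.e. F > 3/4."""
    return fidelity_from_depolarization(p) > Fraction(3, 4)


def two_way_positive(p):
    """Condition quoted in the text with a back channel: p < 2/3, i.e. F > 1/2."""
    return fidelity_from_depolarization(p) > Fraction(1, 2)


half = Fraction(1, 2)  # the 50% depolarizing channel mentioned in the abstract
print(fidelity_from_depolarization(half),
      one_way_not_excluded(half),
      two_way_positive(half))
```

Exact rational arithmetic is used so that the boundary cases p = 1/3 and p = 2/3 are not blurred by rounding.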

It is instructive to compare our results to the simpler theory of noiseless 
quantum channels and pure maximally-entangled states. There the transmis- 
sion of an intact two-state quantum system or qubit (say from Alice to Bob) 
is a very strong primitive, which can be used to accomplish other weaker ac- 
tions, in particular the undirected sharing of an ebit of entanglement between 
Alice and Bob, or the directed transmission of a bit of classical information
from Alice to Bob. (These two weaker uses to which a qubit can be put are
mutually exclusive, in the sense that k qubits cannot be used simultaneously
to share $\ell$ ebits between Alice and Bob and to transmit m classical bits from
Alice to Bob if $\ell + m > k$. [g|])

A noisy quantum channel $\chi$, if it is not too noisy, can similarly be used, in
conjunction with QECCs, for the reliable transmission of unknown quantum
states, the reliable sharing of entanglement, or the reliable transmission of
classical information. Its capacity for the first two tasks, which we call the
quantum capacity $Q(\chi)$, is a lower bound on its capacity $C(\chi)$ for the third
task, which is the channel's conventional classical capacity.

Most error-correction protocols are designed to deal with error processes 
that act independently on each qubit, or affect only a bounded number of 
qubits within a block. A quite different error model arises in quantum cryp- 
tography, where the goal is to transmit qubits, or share pure ebits, in such a 
way as to shield them from entanglement with a malicious adversary. Tra- 
ditionally one grants this adversary the ability to listen to all classical com- 
munications between the protagonists Alice and Bob, and to interact with 
the quantum data in a highly correlated way designed to defeat their error- 
correction or entanglement-purification protocol. It is not yet known whether 
protocols can be developed to deal successfully with such an adversarial en- 
vironment. 

Even for the simple error models which introduce no entanglement between
the message qubits, there is still a wide range of open questions. As
Fig. 8 has shown, we still do not know what the attainable yield is for a
given channel fidelity; but we are hopeful that the upper and lower bounds 
we have presented can be moved towards one another, for both one-way and 
two-way protocols. 

Improving the lower bounds is relatively straightforward, as it simply
involves construction of protocols with higher yields. An important step
towards this has been the realization that it is not necessary to identify the
entire error syndrome to successfully purify. This has permitted the lower
bound for one-way protocols (and thus for QECCs) to be raised slightly above
the $D_H$ curve of Fig. | (see Ref. [||).

Improvement of the upper bounds is more problematical. For two-way 
protocols, we presently have no insight into how this bound can be lowered 
below E. Characterizing $D_1$, $D_2$ and E for all mixed states would be a great
achievement j|9| , but even that would not necessarily provide a complete
theory of mixed state entanglement. Such a theory ought to describe, for any
two bipartite states M and M', the asymptotic yield with which state M' can 
be prepared from state M by local operations, with or without classical
communication. In general, the most efficient preparation would probably not
proceed by distilling pure entanglement out of M, then using it to prepare
M'; it is even conceivable that there might be incomparable pairs of states,
M and M' such that neither could be prepared from the other with positive 
yield. 

Surprisingly, basic questions about even the classical capacity of quantum 
channels remain open. For example, it is not known whether the classical 
capacity of two parallel quantum channels can be increased by entangling 
their inputs. 

For us, all of this suggests that, even 70 years after its establishment, we 
still are only beginning to understand the full implications of the quantum 
theory. Its capacity to store, transmit, and manipulate information is clearly 
different from anything which was envisioned in the classical world. It still 
remains to be seen whether the present surge of interest in quantum error 
correction will enable the great potential power of quantum computation to 
be realized, but it is clearly a step in this direction.

A Appendix: Implementation of Random Bilateral Rotation



In this appendix we show how an arbitrary density matrix of two particles
can be brought into the Werner form by making a random selection, with
uniform probabilities, from a set of 12 operations $\{U_i\}$ which involve identical
rotations on each of the two particles. (Thus, the rotations $U_i$ are members of
a particular SU(2) subset of SU(4).) After such a set of rotations the density
matrix is transformed into an arithmetic average of the rotated matrices:

$$M_T = \frac{1}{N} \sum_{i=1}^{N} U_i^{\dagger} M U_i . \tag{99}$$

N will be 12 in the example we are about to give. The 4×4 density matrix
M, expressed in the Bell basis, has three parts which behave in different ways
under rotation: 1) the diagonal singlet matrix element, which transforms
as a scalar; 2) three singlet-triplet matrix elements, which transform
as a vector under rotation; and 3) the 3×3 triplet block, which transforms as
a second-rank symmetric tensor. In the desired Werner form the vector part
of the density matrix is zero, and the symmetric second-rank tensor part is 
proportional to the identity. 

The mathematics of this problem is the same as that which describes the 
tensor properties of a large collection of molecules as would occur in a liquid, 
glass, or solid. In the case of a liquid, all possible orientations of the molecules 
occur. Because of the orientational averaging (mathematically equivalent to
Eq. (99), where the sum runs over all SU(2) operations), vector quantities
become zero (e.g., the net electric dipole moment of the liquid is zero), while 
second-rank tensor quantities become proportional to the identity (e.g., the 
liquid's dielectric response is isotropic) Q . 

But following the molecular-physics analogy further, we know that crys- 
tals, in which the molecular units only assume a discrete set of orientations, 
can also be optically isotropic and non-polar. It is also well known that only 
cubic crystals have sufficiently high symmetry to be isotropic. This suggests 
that if the sum in Eq. (99) is over the discrete subgroup of SU(2) corresponding
to the symmetry operations of a tetrahedron (the simplest object with
cubic symmetry), then the desired Werner state will result; and this turns 
out to be the case.

The bilateral rotations $B_{x,y,z}$ introduced in Sec. 3.2.3 are the appropriate
starting point for building up the desired set of operations. In fact they
correspond to 4-fold rotations of a cube about the x-, y-, and z-axes. This is
not evident from their action on Bell states as shown in Table 1, where they
appear to correspond to 2-fold operations. This is because this table does
not show the effect of the B rotations on the phase of the Bell states. Phases
are not required in the purification protocols described in the text, because
the density matrix in all these cases is already assumed to be diagonal, so
that the phases do not appear. But for the present analysis they do, so we
repeat the table with phases in Table 4.







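                          source
           $\Psi^-$    $\Phi^-$     $\Phi^+$     $\Psi^+$

$I$        $\Psi^-$    $\Phi^-$     $\Phi^+$     $\Psi^+$
$B_x$      $\Psi^-$    $\Phi^-$     $i\Psi^+$    $i\Phi^+$
$B_y$      $\Psi^-$    $-\Psi^+$    $\Phi^+$     $\Phi^-$
$B_z$      $\Psi^-$    $i\Phi^+$    $i\Phi^-$    $\Psi^+$

Table 4: Modification of part of Table 1, including the phase-changes of the
Bell states.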



When presented in this way, it is evident that these operations are 4-fold
(that is, $B^4 = I$), and indeed, they are the generators of the 24-element
group of rotations of a cube, known as the group O in crystallography [50].
(It is also isomorphic to $S_4$, the permutation group of 4 objects.)
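The 4-fold character of the B operations can be checked directly. The sketch below builds each bilateral rotation as U ⊗ U with U = exp(iπσ/4) = (1 + iσ)/√2 and reads off its action, including phases, on the four Bell states. The sense of rotation, the sign conventions of the Bell states, and all helper names (`bilateral`, `image`, `power`) are assumptions made for this illustration; a different convention changes individual phases but not the conclusion that $B^2 \ne I$ while $B^4 = I$.

```python
S = 2 ** -0.5
PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
# Bell states in the computational basis |00>, |01>, |10>, |11> (real vectors).
BELL = {
    "Psi-": [0, S, -S, 0],
    "Phi-": [S, 0, 0, -S],
    "Phi+": [S, 0, 0, S],
    "Psi+": [0, S, S, 0],
}


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def bilateral(axis):
    """U (x) U with U = exp(i*pi*sigma/4) = (1 + i*sigma)/sqrt(2) (assumed sense)."""
    sigma = PAULI[axis]
    u = [[S * ((r == c) + 1j * sigma[r][c]) for c in range(2)] for r in range(2)]
    return kron(u, u)


def image(op, source):
    """Return (phase, target) such that op|source> = phase * |target>."""
    vec = [sum(op[r][c] * BELL[source][c] for c in range(4)) for r in range(4)]
    for target, bell in BELL.items():
        amp = sum(bell[k] * vec[k] for k in range(4))  # <target|vec>; bell is real
        if abs(amp) > 0.5:
            return complex(round(amp.real), round(amp.imag)), target
    raise ValueError("op does not map this Bell state to a Bell state")


def power(op, n):
    """n-fold matrix product of op with itself (n >= 1)."""
    out = op
    for _ in range(n - 1):
        out = matmul(out, op)
    return out


B = {axis: bilateral(axis) for axis in "xyz"}
for axis in "xyz":
    print("B_" + axis, [image(B[axis], source) for source in BELL])
```

Applying `power(B["x"], 2)` to $\Phi^+$ returns the same state with phase -1, which is the sign that a table without phases hides.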

Now, as mentioned above, only the rotations which leave a tetrahedron
invariant are necessary to make the density matrix isotropic. This is a 12-element
subgroup of O known as T (which is isomorphic to $A_4$, the group of all
even permutations of 4 objects). Written in terms of the B's, these twelve
operations are



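$$
\{U_i\} = \left\{
\begin{array}{l}
I \ \text{(identity)} \\
B_x B_x \\
B_y B_y \\
B_z B_z \\
B_x B_y \\
B_y B_z \\
B_z B_x \\
B_y B_x \\
B_x B_y B_x B_y \\
B_y B_z B_y B_z \\
B_z B_x B_z B_x \\
B_y B_x B_y B_x
\end{array}
\right. \qquad M \to W_F \tag{100}
$$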

It is easily confirmed by direct calculation, using Table 4, that this set of 12
$\{U_i\}$, when applied to a general density matrix M in Eq. (99), results in a
Werner density matrix $W_F$ of Eq. (17).
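The direct calculation can also be carried out numerically. The following sketch applies the average of Eq. (99) over the twelve products of Eq. (100) to a randomly generated density matrix and inspects the result in the Bell basis. It assumes the bilateral rotations are U ⊗ U with U = exp(iπσ/4); with the opposite sense for all three axes, $B_xB_y$, $B_yB_z$ and $B_zB_x$ become the same rotation, and the listed products would no longer be twelve distinct elements. The helper names and the random seed are illustrative only.

```python
import random

S = 2 ** -0.5
PAULI = {"x": [[0, 1], [1, 0]], "y": [[0, -1j], [1j, 0]], "z": [[1, 0], [0, -1]]}
# Rows are the Bell states Psi-, Phi-, Phi+, Psi+ in the basis |00>,|01>,|10>,|11>.
BELL = [[0, S, -S, 0], [S, 0, 0, -S], [S, 0, 0, S], [0, S, S, 0]]
IDENTITY = [[complex(r == c) for c in range(4)] for r in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def bilateral(axis):
    """U (x) U with U = (1 + i*sigma)/sqrt(2); the sense of rotation is assumed."""
    sigma = PAULI[axis]
    u = [[S * ((r == c) + 1j * sigma[r][c]) for c in range(2)] for r in range(2)]
    return kron(u, u)


def product(*ops):
    out = IDENTITY
    for op in ops:
        out = matmul(out, op)
    return out


BX, BY, BZ = bilateral("x"), bilateral("y"), bilateral("z")
TWELVE = [product(*ops) for ops in (
    (), (BX, BX), (BY, BY), (BZ, BZ),
    (BX, BY), (BY, BZ), (BZ, BX), (BY, BX),
    (BX, BY, BX, BY), (BY, BZ, BY, BZ), (BZ, BX, BZ, BX), (BY, BX, BY, BX))]


def twirl(m, ops=TWELVE):
    """Uniform average of U^dagger m U over the given operations, as in Eq. (99)."""
    total = [[0j] * 4 for _ in range(4)]
    for u in ops:
        rotated = matmul(matmul(dagger(u), m), u)
        for r in range(4):
            for c in range(4):
                total[r][c] += rotated[r][c] / len(ops)
    return total


def in_bell_basis(m):
    """Matrix elements <bell_a|m|bell_b>; the Bell vectors are real."""
    bell_t = [list(col) for col in zip(*BELL)]
    return matmul(matmul(BELL, m), bell_t)


def random_density_matrix(rng):
    """G G^dagger / trace for a random complex 4x4 matrix G."""
    g = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)] for _ in range(4)]
    rho = matmul(g, dagger(g))
    trace = sum(rho[k][k] for k in range(4)).real
    return [[x / trace for x in row] for row in rho]


m = random_density_matrix(random.Random(0))
before = in_bell_basis(m)
after = in_bell_basis(twirl(m))
for row in after:
    print(["%.4f" % abs(x) for x in row])
```

In the printed matrix the singlet entry is unchanged from `before[0][0]`, the three triplet diagonal entries are equal, and all off-diagonal entries vanish to rounding error, which is the Werner form.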

There are a couple of special cases in which the set of rotations can be 
made simpler. If it is only required that the state M be taken to some 
Bell-diagonal state W (Eq. (^9|)), then a smaller subset, corresponding to
the orthorhombic crystal group $D_2$ (an abelian four-element group) may be
used: 

$$
\{U_i\} = \left\{
\begin{array}{l}
I \\
B_x B_x \\
B_y B_y \\
B_z B_z
\end{array}
\right. \qquad M \to W \tag{101}
$$

Finally there is another special case, which arises in some of our purification
protocols, in which the density matrix W is already diagonal in the Bell
basis, but is not isotropic (i.e., the triplet matrix elements are different from
one another). To carry W into $W_F$, the discrete group in Eq. (100) can
again be reduced, in this case to the three-element group with the elements

$$
\{U_i\} = \left\{
\begin{array}{l}
I \\
B_x B_x B_x B_y \\
B_x B_x B_x B_z
\end{array}
\right. \qquad W \to W_F \tag{102}
$$

One further feature of any set $\{U_i\}$ that takes the density matrix to the
isotropic form $W_F$, which can be used to simplify the set, is that the modified
set $\{RU_i\}$, for any bilateral rotation R, also results in a Werner density
matrix $W_F$ in Eq. (99). Since the density matrix is already isotropic, any
additional rotation R leaves it isotropic. (A cubic crystal has the same dielectric
properties no matter how it is rotated.) For example, if we take $R = B_x$,
the three operations of Eq. (102) take the form

$$
\{U_i\} = \left\{
\begin{array}{l}
B_x \\
B_y \\
B_z
\end{array}
\right. \qquad W \to W_F \tag{103}
$$
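For a density matrix that is already Bell-diagonal, the phases drop out and each B simply exchanges two of the triplet weights, so the effect of the three-element set in Eq. (103) can be followed with permutations alone. The sketch below assumes that $B_x$ exchanges $\Phi^+$ and $\Psi^+$, $B_y$ exchanges $\Phi^-$ and $\Psi^+$, and $B_z$ exchanges $\Phi^-$ and $\Phi^+$, with $\Psi^-$ fixed in each case; the weights chosen and the names `SWAPS`, `apply` and `average_over` are illustrative.

```python
from fractions import Fraction

# Bell-diagonal weights are ordered as (Psi-, Phi-, Phi+, Psi+).
# Each bilateral rotation swaps two triplet positions (assumed assignment).
SWAPS = {"Bx": (2, 3), "By": (1, 3), "Bz": (1, 2)}


def apply(op, weights):
    """Permute the Bell-diagonal weights as the named rotation does."""
    i, j = SWAPS[op]
    out = list(weights)
    out[i], out[j] = out[j], out[i]
    return out


def average_over(ops, weights):
    """Uniform mixture of the permuted weight vectors, as in Eq. (99)."""
    images = [apply(op, weights) for op in ops]
    return [sum(column) / len(ops) for column in zip(*images)]


w = [Fraction(7, 10), Fraction(1, 10), Fraction(1, 20), Fraction(3, 20)]
werner = average_over(["Bx", "By", "Bz"], w)
print(werner)
```

Each triplet position receives each of the three original triplet weights exactly once, so the average is (1 - F)/3 in every triplet slot while the singlet weight F is untouched.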

B Appendix: General-noise error correction 

In this appendix we present an argument, based on twirling, that correcting
amplitude and phase errors corrects every possible error. We have derived
finite-block purifications under the assumption that the pairs which are affected
by the environment are subject to errors of the Werner type, in which
the Bell state evolves into a classical mixture of Bell states (see Eq. (63)).
But the most general effect which noise can have on a Bell state appears very
different from the Werner noise model, and is characterized by the 4×4 density
matrix M into which a standard Bell state $\Phi^+$ evolves (see Fig. |^). Many
additional parameters besides the fidelity $F = \langle\Phi^+|M|\Phi^+\rangle$ are required for
the specification of this general error model. A general 4×4 density matrix
of course requires 15 real parameters for its specification. However, not all
of these parameters define distinct errors, since any change of basis by Alice
or Bob cannot essentially change the situation (in particular, the ability to
purify EPR pairs cannot be changed). This says that 6 parameters, those
involved in two different SU(2) changes of basis, are irrelevant. But this
still leaves 9 parameters which are required to fully specify the most general
independent-error model [51]. How then does correction of just amplitude,
phase, and both, deal with all of these possible noise conditions, characterized
by 9 continuous parameters?

To show this we will again introduce the "twirl" of Fig. 5, although in the
end it will be removed again. Recall that any density matrix is transformed
into one of the Werner type by the random twirl. (See item 5 of Sec. 3.1
for the method of twirling the $\Phi^+$ state.) Thus, if twirling is inserted as
shown in Fig. 19, or in the corresponding places in Fig. |3], then the channel
is converted to the Werner type, and the error correction criteria we will
describe in the next section will work.

Figure 19: If the state is subject to the initial and final rotations $R^T$ and
$R$ (the "twirl" T) in the QECC of Fig. 16, then the action of the noise $N_B$
is guaranteed to be of a simple form in which only three types of errors,
amplitude, phase, or amplitude-and-phase, can occur on each qubit [13]; this
corresponds to the Werner mixed state $W_F$ in the purification picture. As
described in the text, for finite-block error correction the QECC protocol will
succeed even if the twirl T is not performed.



But let us consider the action of the twirl in more detail. Let us personify
the twirl action T in Fig. 19 (or in the corresponding purification protocol
of Fig. H|, as in Fig. 5) by saying that an agent ("Tom") performs the twirl
for the n bits by choosing at random, n times, one of the 12 bilateral
rotations tabulated in Appendix A. Tom makes a record of which of these
$12^n$ actions he has taken; he does not, however, reveal this record to Alice or
Bob. Without this record, but with a knowledge that Tom has performed this
action, Alice and Bob conclude that the density matrix of the degraded pairs
has the Werner form. They proceed to use the protocol they have developed
to purify m EPR pairs perfectly. Now, suppose that after this has been
done, Tom reveals to Alice and Bob the twirl record which he has heretofore
kept secret. At this point, Alice and Bob now have a revised knowledge of
the state of the particle pairs which entered their purification protocol; in
fact, they now know that the density matrix is just some particular rotated
version of the non-Werner density matrix in which the environment leaves the
EPR pairs. Nevertheless, this does not change the fact that the purification
protocol has succeeded. Indeed, we must conclude that it succeeds for each of
the $12^n$ possible values of Tom's record, and in particular it succeeds even in
the case that each of Tom's n rotations was the identity operation. Thus, the
purification protocol works on the original non-Werner errors, even if Tom
and his twirling are completely removed. This completes the desired proof,
and we will thus develop protocols for correcting Werner type errors, Eq.
(63), keeping in mind their applicability to the more general case.

A slight extension of the above arguments shows that asymptotic large-block
purification schemes such as our hashing protocol of Sec. 3.2.3 are also
capable of correcting for non-Werner error. Consider a non-Bell-diagonal
product density matrix of n particles, $\mathbf{M} = (M)^n$, whose fidelity is such
that, after twirling, it can be successfully purified, resulting in entangled
states whose final fidelity with respect to perfect singlets approaches 1 in
the limit $n \to \infty$. The hashing protocol produces truly perfect singlets of
unit fidelity for a likely set $\mathcal{L}$ of error syndromes containing nearly all the
probability. This means that we can write $\mathbf{M} = (1 - \epsilon)\mathbf{M}' + \epsilon\,\delta\mathbf{M}$, where $\mathbf{M}'$
can be purified with exactly 100% final fidelity. By the above arguments, $\mathbf{M}'$
can be successfully purified even if twirling is not performed. Since $\epsilon \to 0$ as
$n \to \infty$, the original state $\mathbf{M}$ will also be purified to fidelity approaching 1,
even without twirling.